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Abstract 


In almost all numerical simulations matrix inversion is the most recursive and 
computationally intensi\'e step. Therefore, the choice of matrix inverter, or equiv- 
alently the linear system solver is a critical step towards reducing the compu- 
tational time of the overall simulation. Conjugate Gradient method is an ex- 
tremely effective technique for solving linear systems when the coefficient matrix 
is symmetric and positive definite. Generalization and efficient implementation 
of conjugate gradient method for the linear systems with unsymmetric coefficient 
matrices is still an active area of research. One such generalization, based on pre- 
conditioning with incomplete lower-upper (ILU) decomposition of the coefficient 
matrix, has been proposed by Kershaw [1978]. Following this generalization, She- 
orey [2001] has developed a Preconditioned Conjugate Gradient (PCG) algorithm 
that is believed to work well for the case of any nonsingular coefficient matrix. 

The aim of the present research is to assess the above mentioned PCG algo- 
rithm for matrix inversion in various test cases including the complex problem 
of Czochralski crystal growffh simulation. Czochralski technique is quite popular 
for growing single crystals that are of great importance for modern electronic and 
electro-optical applications and devices. Performance of these devices is deleteri- 
ously inffuenced by the presence of solute inhomogeneities in the crystals used. In 
order to investigate the solute segregation in these crystals, the coupled problem 
of heat transfer, hydrodynamics and solute conservation must be simultaneously 
solved. This research presents a macroscopic model for solute segregation, 

CPU time analysis for the test case of unsteady-nonlinear heat conduction 
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iii 

shows that PCG is extremely effective for matrix inversion as compared to Gaus- 
sian elimination and Gauss-Seidel iterations. PCG becomes an even more at- 
tractive choice on finer grids. Further PCG is found to be well suited for matrix 
inversion in advection-diffusion problems. The results obtained for the fully devel- 
oped Nusselt number in channel flow matches closely with the analytical solution 
reported in the literature. A comparison of the eigenvalue spectra of the original 
and preconditioned matrices show that preconditioning with incomplete lower- 
upper (ILU) decomposition is very effective in lowering the condition number 
towards unity. Moreover, this preconditioning technique is general in the sense 
that it can be constructed and applied to any nonsingular coefficient matrix. 

For growth of a NdiYAG crystal from its melt, the PCG algorithm has been 
used to determine the quasi-steady flow and temperature fields. In addition, the 
distribution of Nd particles in YAG has been determined by a truly unsteady 
calculation. Results obtained in this part of the study show the following: 

In the simulations of Czochralski crystal growth, the pull velocity is varied 
with time to grow constant diameter crystals. The results obtained for pull- 
velocity variation show that it is not possible to maintain favorable growth con- 
ditions over a longer growth period. This is because the pull velocity continu- 
ously decreases with time. After a small increase in crystal length, melting starts. 
Therefore, to grow longer crystals with constant diameter, other parameters such 
as the crucible temperature and the enclosure temperature need to be varied, 
while keeping pull velocity positive and a constant. Redistribution of the seg- 
regated solute is found to correlate with the complex flow patterns in the melt. 
Profiles of the radial solute distribution reveal that crystal rotation improves ra- 
dial uniformity of solute concentration in the grown crystals, while the uniformity 
along the crystal length is unaffected. 
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Chapter 1 
Introduction 


In all numerical simulations, the result of the discretization process is a system 
of algebraic equations, which can be linear or non-linear according to the nature 
of the partial differential equation(s) that govern the physical process of interest. 
In the non-linear case, the discretized equations must be solved by an iterative 
technique that involve guessing a solution, linearizing the equations about that 
solution, and improving the solution until a converged result is obtained. More- 
over,, if the mathematical problem is unsteady, these nonlinearity iterations need 
to be carried out at every time step. Thus in numerical simulations, matrix in- 
version is the most recursive and computationally intensive step and therefore, 
the choice of matrix inverter, or equivalently the linear system solver is a critical 
step towards reducing the computational time of the overall simulation. 

Matrix inverters can be classified as direct and iterative. Direct methods are 
usually a variation of the Gaussian elimination technique, making use of forward 
elimination and backward substitution. It is equivalent to the decomposition of 
the coefficient matrix into lower and upper triangular factors. Direct inverters in- 
volve larger number of arithmetic operations per inversion and hence carry large 
round-off errors. Moreover, direct solvers when applied to a sparse system, in- 
troduce a large number of additional nonzero entries into, the coefficient matrix. 
These ^'fill-in” values destroy the sparse structure of the coefficient matrix and 
thus increases the storage requirements significantly. However direct methods 
obtain the solution in finite number of predetermined steps and are generally 
stable. On the other hand, the term ‘'iterative methods" refers to a wide range of 
techniques that use successive approximations to obtain more accurate solutions 
to a linear system at each step. The computational effort per inversion of an 
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iterative technique depends on its convergence rate. The rate of convergence of 
these methods depends greatly on the eigenvalue spectrum of the coefficient ma- 
trix. Hence, iterative methods usually involve a second matrix that transforms 
the coefficient matrix into one with a more favorable spectrum. This technique 
is called ''preconditioning' and the transformation matrix is called a "precondi- 
tioned' . A good preconditioner improves the convergence rate of an iterative 
method, sufficiently to overcome the extra cost of constructing and applying the 
preconditioner. When using an iterative method, the coefficient matrix is in- 
volved (directly or indirectly) only in terms of matrix by vector products. Thus 
there is no need for storing the zero elements of the coefficient matrix, and there- 
fore, it is possible to implement the algorithms that are extremely cost-effective 
with respect to computing time as well as storage. In addition of the matrix by 
vector product, many iterative solvers are based on the vector operations like the 
calculation of a inner product. Thus except for some possible problems related to 
the global communication needed for the computation of inner products, such it- 
erative methods are well suited for parallel and vectorized implementations. But 
then iterative methods are only conditionally stable. To combine the benefits of 
both the direct and iterative approaches, hybrid methods have been developed. 
The "Preconditioned Conjugate Gradient (PCG)" is one such technique that has 
been assessed in the present study. 

1.1 Literature Review on Matrix Inversion 

Lanczos [1952] and Hestenes and Stiefel [1952] discovered the method of Con- 
jugate Gradients (CG) for solving linear systems with symmetric and positive 
definite coefficient matrix A. This method was initially regarded as an exact (di- 
rect) solution method, guaranteed to converge in no more than n steps, where n 
is the dimension of the problem. More precisely, in exact arithmetic the method 
is guaranteed to compute the solution to Ax = b in exactly m steps, where m is 
the degree of minimal polynomial of A, i.e., the number of distinct eigenvalues of 
A (since A is assumed to be symmetric). Based on numerical experiments it was 
observed that in presence of rounding errors, a few more than n iterations were 
often necessary to obtain an accurate solution, particularly on ill-conditioned 
systems. On the other hand, Hestenes and Stiefel [1952] observed that on well 



1.1 Literature Review on Matrix Inversion 


3 


conditioned systems, satisfactory approximations could be obtained in far fewer 
than n iterations, making the Conjugate Gradient method an attractive alter- 
native to Gaussian Elimination. In spite of this, for a number of reasons the 
Conjugate Gradient method saw relatively little use for almost 20 years, during 
which Gaussian Elimination (for dense matrices) and SOR and Chebyschev iter- 
ation (for sparse matrices) were usually the preferred solution methods. Great 
details of these developments can be found in the interesting paper on the history 
of conjugate gradient method by Golub and O’Leary [1989]. 

This situation finally changed around 1970 with the publication of a paper 
of Reid. Reid [1971] shown that for large sparse matrices that are reasonably 
well conditioned, the Conjugate Gradient method is a very powerful scheme that 
yields good approximate solutions in far fewer than n steps. Reid’s paper set the 
stage for the rehabilitation of the Conjugate Gradient method and stimulated 
research in various directions. On one hand, extensions to linear systems with 
indefinite and/or nonsymmetric matrices were sought; on the other, techniques 
to improve the conditioning of linear systems were developed in order to improve 
the rate of convergence of the method. 

Early ideas of preconditioning the Conjugate Gradient method can be found 
in Engeli et al. [1959], and occasionally in several papers published during 1960s 
(Golub and O’Leary [1989]). A detailed study of SSOR preconditioning as a 
means of accelerating the convergence was carried out in by Axelsson [1972]. 

A measure breakthrough took place around the mid-1970s, with the intro- 
duction by Meijerink and van der Vorst of the incomplete Cholesky-Conjugate 
Gradient (ICCG) algorithm. Incomplete factorization methods were introduced 
for the first time by Buleev in the then-Soviet Union in the late 1950s, and inde- 
pendently by Varga in 1970s. However, Meijerink and van der Vorst [1977] deserve 
credit for recognizing the potential of incomplete factorizations as preconditioners 
for the Conjugate Gradient method. Another paper that did much to popularize 
this approach is by Kershaw [1978]. Since then, a number of improvements and 
extensions have been made, including level-of-fills and drop tolerances-based in- 
complete factorizations, generalizations to block matrices, modified and stabilized 
variants and even (more recently) efficient parallel implementations. 

Then in late 1970s, an extensive research was under way for generalized CG 
methods that could be applied to nonsymmetric and/or indefinite problems. By 
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the early 1980s there was a new plethora of new versions of generalized conjugate 
gradient methods. However, none of these methods for general nonsymmetric 
problems worked in practice as well as the CG method did for symmetric positive 
definite problems. For some problems either a very large number of iterations were 
required or the method did not even converge. In his work Kershaw also proposed 
a generalization of the Conjugate Gradient algorithm for any non-singular coeffi- 
cient matrix arising from partial differential equations. Sheorey T. [2001] followed 
this generalization and developed an incomplete LU preconditioned version of the 
conjugate gradient method that is applicable for the case of any non-singular co- 
efficient matrix. The same version of PCG has been assessed in the present study. 
Some applications of this PCG algorithm can be found in Sheorey et al. [2001], 
Sheorey and Muralidhar [2001, 2003]. 

Books by Axelsson [1994], Barrett et al. [1993], Saad [1996], Golub et al. 
[1996] and Watkins [2002] provide excellent references for iterative matrix in- 
version in general and PCG in particular. Pissanetzky [1984] provides detailed 
discussions of almost all popular storage formats for sparse matrix handling. 

1.2 Application Problem: Czochralski Crystal 
Growth 

The growth of semiconductor single crystals is the basis for device fabrication in 
microelectronics industry. Single crystals of oxides have also become important 
materials for high power laser applications. These oxide crystals are found to have 
special optical, fluorescent, electrical and mechanical properties, and therefore, 
these crystals are finding new applications in a vast field of ferroelctric, pyroelec- 
tric, piezoelectric , acoustic, magnetic and opto-magnetic devices. Most of these 
applications require a high level of microstructural and chemical perfection that 
only can be exhibited by single crystals. 

Although several methods have been devised for growing single crystals, the 
Czochralski (Cz) method has virtually dominated the entire production of single 
crystals. The popularity of this method comes from its ability to meet the strin- 
gent requirements for purity, doping, electrical and mechanical properties, and 
crystallographic perfection. 

The basic principle of growing a single crystal by the Czochralski method 
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is shown in Figure 1.1. A crucible is initially filled with the material and then 
thermal energy is supplied by a heater surrounding the crucible. This raises 
the temperature of the solid above its melting temperature. A seed crystal, at 
the end of the rod, is then dipped into the melt and, after appropriate start- 
up procedures, slowly withdrawn from the melt. To keep the feeding material 
molten, the crucible is maintained at a constant temperature above the melting 
point. 

Under suitable temperature conditions re-crystallization in the form of a single 
crystal occurs. This method bears Czochralski’s name, who in 1916 established 
it as a means to determine the crystallization velocity of metals. 


Pulj rate 



Figure 1.1: A schematic of the basic principle of Czochralski crystal growth pro- 
cess. 

Convection in Czochralski melt is pervasive in all terrestrial' growth systems. 
Sources for flow include buoyancy-driven convection caused by the solute and 
temperature dependence of the density; surface tension gradients along fluid- 
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melt menisci; forced convection introduced by the motion of solid surfaces, such as 
crucible and crystal rotation in the Cz systems; and motion of the melt induced by 
the solidification of the material. These flows are important causes of convection 
of heat and species and can have a dominant influence on the temperature field in 
the system and on solute incorporation into the crystal. A schematic of various 
transport mechanisms of Czochralski process is shown in Figure 1.2. 



Figure 1.2: Schematic of various transport mechanisms in a Czochralski system. 

A thorough understanding of heat and mass transport is a prerequisite to the 
optimization of Czochralski crystal growth systems for control of crystal quality, 
as measured by the degree of crystallographic perfection of the lattice and the 
spatially uniformity of electrically active solutes in it. Fluid motion in Cz-system 
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is of concern to crystal growers as it determines the transport of solute , heat, 
impurities, etc. to the grown crystal. Magnitude of stresses in the grown crystal is 
governed by the temperature gradients in the melt. It is currently accepted that 
the transport properties of the melt are the primary factors affecting the quality 
of grown crystals. These properties of melt are themselves determined by a broad 
and interacting matrix of variables that include crystal rotation, crucible rotation, 
pull rate, crucible temperature, pressure, orientation etc. Thus, state-of-the-art 
crystal growth requires algorithms that models the growth by incorporating the 
effects of all parameters and gives a set of values of these parameters as best 
possible guidelines for growing a particular crystal. 

1.3 Literature Review on Crystal Growth 

Over the past two decades, numerical simulations of Czochralski (Cz) crystal 
growth have been performed with varying degrees of complexity. Both finite 
difference and finite element methods have been used to simulate first the two 
dimensional and then the three-dimensional melt flows with many different kind of 
boundary conditions. A regular cylindrical geometry has usually been considered 
in most of the cases. Recently, efforts have been devoted to account for the 
crucible bottom curvature. Finite element methods can easily handle the non- 
flatness of the crucible bottom. However, this task becomes difficult if a regular 
finite difference or finite volume method is used. Consideration of crucible shape 
is important because the melt in industrial Cz(Si) process can become quite 
shallow. Also as the crystal is pulled, the aspect ratio of the melt continuously 
decreases. 

Other difficulties in modeling realistic Cz flows arise from the fact that the 
positions of the melt-crystal interface and melt meniscus are neither known a 
priori nor planar. The melt-crystal interface coincides with the freezing point 
isotherm in the system (for a pure material), and the melt meniscus results from 
a force balance between the surface tension, pressure force and viscous effects. 
To complicate this further, the flow and temperature fields which are used to 
determine the shapes and locations of interface and meniscus are often oscillatory 
in nature. To overcome the difficulties associated with the time varying domains 
and irregular boundaries, different techniques have been employed by various 
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investigators. 

Derby and Brown [1986] have developed an ’’integrated hydrodynamic ther- 
niocapillary” (IHTGM) model based on the finite element method which allows 
the calculation of interface as well as meniscus shape. The IHTCM models heat 
transfer bj'- conduction and diffuse, gray radiation between all components within 
the Cz puller and axisymmetric, steady-state, laminar flow in the melt. The ra- 
dius of the crystal is computed simultaneously and self-consistently with the heat 
transfer throughout the system and convection in the melt. Kinney and Brown 
[1993] have incorporated k-e model for turbulent flow into the IHTCM. More re- 
cently, a global conduction-radiation procedure based on a finite element program 
ABAQUS, and a thermal radiation model, combined with a three-dimensional 
Navier-Stokes solver based on the finite element method has been developed by 
Koai et al. [1994]. 

In the area of finite differences Kopetsch [1989, 1990] extended a numerical 
procedure to include the deformable shapes of the melt-crystal interface and melt- 
ambient meniscus. His numerical scheme is based on an adaptive grid generation 
that maps the irregular boundary/interface movement and generates a coordinate 
transformation for a curvilinear domain. Several papers have been published in 
this direction and many of the problems faced in the formulation have been 
resolved. For example, the original scheme is suitable only for a single domain 
and steady-state problems. It is however, very difficult to use this formulation for 
a multizone system since it needs an adaptive grid generation algorithm in the 
interface zone. Also, the algorithm is not suitable for transient processes since it 
lacks the relationship between the grids at different times. 

Zhang et al. [1995] developed a multizone adaptive grid-generation (MAGG) 
technique for efficient simulations of various transport phenomena on irregularly 
shaped domains and in regions with moving and/or free boundaries. MAGG 
is based on a variational optimization approach and provides boundary-fitting 
capability and discretionary control over grid distribution, as well a local and 
global orthogonality. Based on this grid generation technique and curvilinear 
finite volume discretization, Prasad et al [1995] developed a high resolution 
computer model (MASTRAPP2d) to simulate crystal growth processes at low 
and high pressures. This scheme allows consideration of a multiphase system for 
more than one material in one single domain which may consist of irregular and 
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moving boundaries, interfaces and free surfaces. 

Solute segregation in the Cz systems is coupled with melt convection, species 
diffusion and the physical and chemical properties of impurities in melt and solid. 
An accurate simulation of segregation remains a difficult task. The major chal- 
lenge in predicting the solute distribution in the crystal stems from the fact that 
solutal transport in melt is an intrinsically dynamic process which is strongly 
time dependent. Thus dopant distribution cannot be treated as quasi-static like 
flow and temperature fields, which do not change much in a small time interval 
since the crystal rates are generally very small. 

The equilibrium segregation coefficient /cq is applicable only for negligibly slow 
growth rates. For finite or higher solidification rates, solute atoms with ko < 1 
are rejected by the advancing solid at a greater rate than they can diffuse into the 
melt. Many attempts therefore have been made to obtain appropriate values of 
the efi'ective segregation coefficient. For ID (longitudinal) steady-state diffusion 
of a solute, Hurle [1993] gives 


|1 - (1 - fco) J] 

where 


( 1 . 1 ) 


J = 


D 




( 1 . 2 ) 


and Us is the rotation rate for the crystal and H is a rotation parameter. 

Burton et al. [1953] assumed that solute distribution varies within a thin 
layer of thickness 6 adjacent to solid-melt interface. A complete mixing caused 
by convection is assumed outside this boundary. They related the parameter J 
to the boundary layer thickness 6 by 


M = -l„(i _ J) (1,3) 

This approach was criticized by Wilson [1980], who has demonstrated that the 
relationship should be 
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D 


(1.4) 


These two definitions (Equation 1.3 and Equation 1.4) approach each other as 
J becomes very small compared with unity. An evaluation of the integral in 
Equation 1.2 then yields 


5 = (1.5) 

This gives the famous BPS model as 


u _ 22 (I 3] 

[^0 + (1 + ko)exp{-Vp6/D)] 

where it has been assumed that = pi. 

In the limit when (5^0, kg/f becomes equal to ko\ that is, the boundary 
layer vanishes due to very strong convective flows in the melt. When 5 — >■ co, 
the effective segregation coefficient approaches unity, that is, the incorporation 
of solute in the solid is equal to that in the melt. 

A detailed discussion on microsegregation in the formalism of the boundary 
layer has been presented by Favier et al. [1982]. They have decoupled the convec- 
tive and diffusive material transport and have assumed convection to be steady 
and diffusion to be time dependent. This assumption is interesting and quite 
questionable since it has been observed that unsteady convection is the cause of 
change of crystal radius. 

Garandet et al. [1994] demonstrated that the results of Favier’s and Wilson’s 
models are quite close and also are in agreement with the theoretical model of 
Hurle et al. [1969]. But the segregation in lateral direction cannot be treated 
by a ID model. Experimental results of Chung et al. [1997] clearly demonstrate 
that the solute concentration is a strong function of radial direction because of 
the strong convective effects in the melt and non-planar crystal-melt interface. 
Solute concentrations based on dynamic simulations from beginning of growth 
until the end have never been reported. 
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1.4 Scope of the Present Work 

In summarizing the previous work on Conjugate Gradient methods, it has been 
realized that it is an extremely effective technique for solving linear systems when 
the coefficient matrix is symmetric positive definite. Generalization and efficient 
implementation of Conjugate Gradient technique for the linear systems with un- 
symmetric coefficient matrices is still an active area of research. Preconditioned 
Conjugate gradient (PCG) algorithm developed by Sheorey [2001] is one such 
generalization and present research is an assessment of this version of the PCG 
algorithm. The algorithm has been examined for the following test problems. 

1. Two dimensional unsteady heat conduction in a region with variable ther- 
mal conductivity; 

2. Two dimensional unsteady advection-diffusion problem of flow between two 
infinite parallel plates; 

3. Momentum, energy and solutal transport equations arising in the modeling 
of Czochralski crystal growth. 

Effect of the preconditioning technique used i. e. , incomplete lower-upper (ILU) 
preconditioning has been studied through the eigenvalue spectrum and condition 
numbers. In addition, solutal transport has been studied with a special interest 
in understanding solute segregation in the Czochralski crystal growth process. 

1.5 Thesis Organization 

The present thesis has been organized in the following manner: 

1. Chapter 1 presents introduction, literature review and scope of the present 
research. 

2. Chapter 2 presents mathematical background of the Conjugate Gradient 
method and then briefly summarizes various preconditioners. 

3. Chapter 3 presents some numerical experiments with PCG. 

4. Chapter 4 presents mathematical formulation for modeling transport phe- 
nomena in Czochralski crystal growth processes. 
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5. Chapter 5 gives a numerical methodology for solving the partial differential 
equations governing Czochralski crystal growth, 

6. Chapter 6 carries conclusions of this study and scope of future work. 


Chapter 2 

Conjugate Gradient Algorithm: 
Mathematical Background and 
Preconditioning 


In this chapter, the steepest descent method for solving linear systems is dis- 
cussed. The powerful Conjugate Gradient(CG) algorithm is then presented as a 
variation on the Steepest Descent algorithm. The standard conjugate gradient 
algorithm (that is applicable only when the coefficient matrix is symmetric and 
positive definite) has been generalized to the case of any nonsingular coefficient 
matrix. The general CG algorithm thus obtained is then preconditioned with 
incomplete lower-upper(LU) decomposition of the coefficient matrix. Various 
preconditioners are finally summarized. 

2.1 Descent Methods 

If A is symmetric and positive definite then the problem of solving the linear 
system Ax = b can be reformulated as a minimization problem. Define a function 
J ; 7^" 7^ by 

J{y) = \y^Ay-y^b 


(2.1) 
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Then the vector that minimizes J is exactly the solution of Ax = h. The can be 
seen by writing the function J as 

J{y) = ^y’^Ay -y'^Ax 

= ^y'^Ay — y'^Ax + ^x^Ax - ^x'^Ax 

^ Zj ^ 

= ( 2 - 2 ) 

The term ^x'^Ax is independent of y, J{y) is minimized exactly when |(y — 
xfA{y-x) is minimized. Since A is positive definite, this term is positive unless 
y — X = 0. Thus it takes minimum value when and only when y — x. 

Descent methods solve Ax ~ bhy minimizing J. These are iterative methods. 
Each descent method begins with an initial guess and generates a sequence 
of iterates x^'^\ x^^\ ... such that at each step J < J {x^^'^), 

and preferably J < J In this sense we get closer to the minimum 

at each step. If at some point we have Ax^'^'^ = b ov nearly so, we accept 
as the solution. The step from x^’^^ to has two ingredients: (i) choice of a 
search direction, and (ii) a line search in the chosen direction. Choosing a search 
line amounts to choosing a vector that indicates the direction in which we 
will travel to get from x^'^^ to x^*"''^). Once a search direction has been chosen, 
a;(^+i) will be chosen to be a point on the line {x^^^ + | a £ 7^}. Thus we 

will have 


2,(fc+i) ^ ^ c^kP^k) (2.3) 

for some real ak- The process of choosing ak from among all Ofc £ 7^ is called 
line search. Here ak is chosen is such a way that J < J (x^^^). One way 

to ensure this is to choose ak so that J = minaenJ + ap^'^^). This 

way of choosing ak is called exact line search and it gives 




( 2 . 4 ) 


where ~b — Ax^^\ 
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2.1.1 Steepest Descent 

The method of steepest descent takes and performs exact line searches. 

Since the search direction is the direction of steepest descent 

of J from point 

To search in the direction of the steepest descent is a quite natural idea but 
unfortunately, it does not work particularly well. Nevertheless, steepest descent is 
worth studying for at least two reasons: (i) It is a good vehicle for introducing the 
idea of preconditioning, (ii) Minor changes turns the steepest descent algorithm 
into the powerful conjugate-gradient method. 

It is a simple matter to program the steepest descent algorithm. At each 
step approximate solution is updated using Equation 2.3 where values of ak are 
calculated using Equation 2.4. Calculation of cxk requires, among all other things, 
multiplying the matrix A by the vector Cost of this operation depends on 
the degree of sparsity of A. In many applications the matrix-vector product is 
the most expensive step of the algorithm and the number of these steps needs 
to be minimized. Calculation of residual = b — Ax(fc) also seems to require 
an additional matrix-vector product Ax^k) and this can be avoided by a simple 
recursion formula 


^{fc+i) ^ ^(k) _ c,^j^p{k) (2.5) 

which is a consequence of Equation 2.3, to update the residual from one iteration 
to the next. Now the matrix-vector product Ap^^^ need not be calculated again 
as it has been already calculated as part of computation of cuk. A' prototype of 
steepest descent algorithm is given below (Watkins [2002]). 

Algorithm 2.1: Prototype Steepest Descent Algorithm 

= 5 — Ax^^^ 
p(o) = 

for A: = 0, 1, 2, ... 

p(k) ^ ^p(k) 

af. = 

^(fc+l) _ ^(fc) _j_ (Xj^pW 
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^(Ar+l) ™ __ 

p(fc+l) _ ^(*:+l) 

2.1.2 Geometric Interpretation and Convergence 

The objective of the descent method is to minimize the function J{y)- From 
Equation 2.2 one can see that J is of the form 

J{y) ^) - 1 (2-6) 

where x is the solution of Ax = 6, and 7 is a constant. Since A is symmetric 
there exist an orthogonal matrix U such that U'^AU is a diagonal matrix A. The 
main diagonal entries of A are eigenvalues of A, which are positive. Introducing 
new coordinates 2 = U'^{y — x) and dropping the inessential constant 7, we see 
that minimizing J{y) is equivalent to minimizing 

n 

J (z) - z^Kz - ^ XiZi^ 

i=l 

To get a picture of function J, consider a 2 x 2 case 
two variables, so its contours or level surfaces J{zi^Z2) - 
which have the following form 

AiZi^ + X2Z2^ — c, 

which are concentric ellipses centered on the origin. The orthogonal coordinate 
transformation z = U'^{y — x) preserves lengths and angles,- so the contours of J 
are also ellipses of the same shape. Such contours for the 2 x 2 case are shown in 
Figure 2.1. 

The semiaxes of the ellipses are \/c/Xi and a/c/A 2, whose ratio is y/'hJXy. 
Eigenvalues of a positive definite matrix are the same as its singular values, so the 
spectral condition number is equal to the ratio of largest to smallest eigenvalue. 
Thus the shape of the contours depends on the condition number of A and greater 
the condition number, the more eccentric the ellipses are. 

The dotted lines in Figure 2.1 represents the four steps of the steepest descent 
algorithm. For a given point, the search proceeds in the direction of the steepest 
descent, which is orthogonal to the contour line. The exact line search follows 


( 2 . 7 ) 

Now j is a function of 
c are curves in a plane 
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the search line to the point at which J is minimized. J decreases as long as the 
search line cuts through the contours. The minimum occurs at the point at which 
the search line is tangent to a contour. Since the next search direction will be 
orthogonal to the contour at that point, each search direction is orthogonal to 
the previous one. Thus the search bounces back and forth in the canyon formed 
by the function J{y) and proceeds steadily towards the minimum. 

The minimum is quickly reached if A is well conditioned. In the best case, 
Ai = Aa, the contours will be circles, the direction of steepest descent (from any 
starting point) points directly to the center, and the exact minimum is reached 
in one iteration. If A is well conditioned, contours will be nearly circular and 
the direction of steepest descent will point close to the center, and the method 
will converge rapidly. If, on the other hand, A is somewhat ill conditioned, the 
contours will be highly eccentric ellipses. Prom a given point, the direction of 
steepest descent is likely to point nowhere near the center and it would not be a 
good search direction. In this case the function J forms a steep, narrow canyon 
and the algorithm bounces back and forth in this canyon. It takes very slow 
steps and approaches the minimum with agonizing slowness. This phenomenon 
does not require extreme ill conditioning. Even if the system is modestly ill 
conditioned and well worth solving from the standpoint of numerical accuracy, 
the convergence can be very slow. 
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2.2 The Conjugate Gradient Method 

The steepest descent method discussed earlier is limited by its lack of memory. It 
only uses information about to get to and all information from earlier 

iterations is forgotten. The Conjugate Gradient method is a simple variation 
on steepest descent that performs better because it has a memory. Conjugate 
gradient updates the solution in each iteration by using 


^(k+i) ^ ^(k) ^ ^ 2 . 8 ) 

which is exactly identical to the steepest descent approach, Equation 2.3. This 
gives the following residual update 


y,(fc+i) _ ^{k) _ Q,j^j^p{k) ^2.9) 

Computation of a is organized a bit differently from steepest descent, but 
the difference is cosmetic. The choice of search directions, although inspired by 
the steepest descent, makes the whole difference. In conjugate gradient search 
direction is updated as 


p{k+l) _ ^ik+l) j^i^pik) ( 2 . 10 ) 

where is chosen such that is A— orthogonal (conjugate) to all the previous 
search directions and this condition gives the following expression for fik 


Pk = 


(j'(k) ^ j'ik)'^ 


( 2 . 11 ) 


In addition to the orthogonality, derivation of the above expression uses self- 
adjoint property of A with respect to the inner product, Axelsson [1994]. 

After j iterations of the form + a.kP^^\ both conjugate gradient 

and steepest descent algorithms give 


~ Q,'opi°^ -1- -h ... + 
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Thus lies ill the space Sj spanned by the j search directions 
This is true for both the algorithms. However, they pick different search di- 
rections. Interestingly, the two different sets of the search directions span the 
same space, Sj = span {b^Ab,A'^b,...,A^~^b}, a Krylov subspace. Since steepest 
descent does exact line searches, minimizes the function J over the j lines 
a;(*) -p Q/p(^‘)^ /j = — 1. The union of these lines is a tiny subset of Sj. 

But, because of the conjugate search directions, conjugate gradient manages to 
pick out the x^^'^ that minimizes J over the entire subspace Sj. This is called 
subspace minimization and therefore, conjugate gradient is also called a subspace 
minimization method. 

Standard conjugate gradient algorithm can now be summarized as follows 
(Barrett et al. [1993]). 

Algorithm 2.2: Standard Conjugate Gradient Algorithm 

1. Set k = 0. Choose an initial approximation xq. Compute = b — Ax^°'^ 
and set po = tq. 

2. Compute 

cik = {rk,rk)/{Pk,Apk) 

-t- OLkP^^'^ 

^(fe+1) _ _ (XkAp^^^ 

p{k+l) _ j.(k+l) _p Pi^pW 

3. Set k ~ kA-l 

4. Repeat step- (2) and step- (3) until convergence in x is reached. 

The method became popular due to many factors; 

1. It has an optimality property over the relevant solution space, which usually 
means convergence to an acceptable accuracy with far fewer steps than the 
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number required for the finite termination property— i.e., it has a relatively 
high rate of convergence. 

2. The rate of convergence can be much improved with various preconditioning 
techniques. 

3. The method is parameter free — i.e., the user is not required to estimate any 
adjustable parameters. 

4. The short recurrence relation makes the execution time per iteration and 
memory requirements acceptable. 

5. The round-off error properties are acceptable. 

2.3 Generalized Conjugate Gradient Method 

With standard conjugate gradient method for solving Ax = b, Algorithm 2.2, 
success is guaranteed only when A is symmetric and positive definite. After k 
iterations of this algorithm solution can be written in the following form 

k 

+ Y^aiA^-\AxQ - b) (2.12) 

i=l 

and the algorithm chooses cti’s to minimize Ija;*; — x\\, where || 2 || = {z^Az)'^^'^. 

The standard conjugate gradient algorithm is modified to a generalized con- 
jugate gradient algorithm that applies to any nonsingular coefficient matrix A. 
The trick used is to solve a modified system A^ Ax ~ A^h and since, in this modi- 
fied system coefficient matrix is symmetric, standard conjugate gradient becomes 
applicable. Applying the standard CG algorithm to this modified system is equiv- 
alent to expanding in powers of (A^A) instead of powers of A in the right hand 
side of the Equation 2.12 and this makes us able to minimize x^ - x in Euclidean 
norm, For the new generalized algorithm we have the following equation. 

k 

= a;o + ^ Q;i(A^A)*“^A^(Aa:o - b) (2.13) 

Generalized conjugate gradient algorithm can now be summarized as follows 
(Kershaw [1978]). 
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Algorithm 2.3: Generalized Conjugate Gradient Algorithm 

1. Set k — Q. Choose an initial approximation xq. Compute = b — 
and set po = 

2. Compute 

ak = {rk,rk)l{Pk,Pk) 

2;(fc+l) — ^(k) a^pi^) 

j,(/c+l) _ J.(.k) _ Q,f,^pik) 

Pf. = (^r^k+l) ^^(,k+l)'j ^ (^^{k) ^j.{k)'^ 
pik+l) _ j^T^ik+l) _j_ p^pik) 

3. Set k = k + 1 

4. Repeat step- (2) and step- (3) until convergence in x is reached, 

2.4 Preconditioned Conjugate Gradient Method 

Consider a linear system of equations, 

Ax = b (2-14) 

where A is any nonsingular sparse matrix. The fact that one can always do an 
incomplete lower-upper (LU) decomposition of the coefficient matrix A is the mo- 
tivation for the selection of incomplete Z,f7-factorization as preconditioner. Exact 
decomposition will be subject to fill-in and moreover, that is almost equivalent to 
solving the linear system itself. So, instead of doing complete L!7-decomposition, 
L is defined to have non-zero entries only within the semi bandwidth of A corre- 
sponding to the lower triangular portion of A and similarly U corresponding to 
the upper triangular portion of A. Thus the factorization is clearly approximate. 
The form of L[/-decomposition algorithm in which all the diagonal elements of L 
are 1 (Crout’s Decomposition) has been used. This incomplete L[/-decomposition 
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algorithm is summarized below. 

Algorithm 2.4: Grout’s L17-decQmpositiQn algorithm 

1. Choose element ajj row wi.se, within the band of A. 

2. If aij lies under the main diagonal, i.e., i > j 

j-i 

fc=l 

3. If Cij lies on the main diagonal, i.e., i—j 

i—\ 

h,i — ^ ] h,k'^k,i 

k=\ 

lli^i — 1 

4. If aij lies above the main diagonal, i.e., i < j 

i-l 

O-iJ — k,kUk,j 
k=l 

Thus one can obtain an approximate inverse {LU)~^ for A and then, can 
rewrite the original system, Equation 2.14, in the following form. 

L-^AU-\Ux) = L~^b (2.15) 

Now, the generalized conjugate gradient method, Algorithm 2.3, when applied to 
the above preconditioned system we get the following Preconditioned Conjugate 
Gradient Algorithm (Kershaw [1978]). 

Algorithm 2.5: Preconditioned Conjugate Gradient Algorithm 

1. Set A: = 0. Choose an initial approximation xq. Compute = b~ Ax^°'l 
and set po = A^ 
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2. Compute 

ak = {rk,{LL'^)-'n)/{p,,U'^Upk) 

3;(^+l) = 2;{^) ..j. 

j3k = (r^'^+i), (I,L^)-V(*^+i)>/(r(^),(LL^)-ir(*)) 

p(fc+i) ^ 

3. Set k — k + 1 

4. Repeat step— (2) and step- (3) until convergence in x is reached. 

This preconditioned conjugate gradient is general and works for the case of 
any nonsingular coefficient matrix. Nevertheless, in the cases where the coeffi- 
cient matrix is quite ill-conditioned there are chances of becoming zero during 
incomplete Lt/-decomposition and if so happen, algorithm will break down imme- 
diately. In such a case, one can simply set Ui^i to a nonzero value and go on with 
the algorithm. Although this will introduce some error but Xi7-decomposition is 
approximate anyway. 

2.5 Preconditioners 

As discussed earlier, the convergence rate of iterative methods depends on spectral 
properties of the coefficient matrix. Hence one may attempt to transform the 
linear system into one that is equivalent in the sense that it has the same solution, 
but has more favorable spectral properties. A preconditioner is a matrix that 
affects such a transformation (Barrett et al. [1993]). 

For instance, if M approximates the coefficient matrix A in some way, the 
transformed system 

M-\Ax = M-H (2.16) 

has the same solution as the original system Ax — 6, but the spectral properties 
of its coefficient matrix M~'^A may be more favorable. 
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In devising a preconditioner, one is forced with a choice between finding a 
matrix M that approximates A, and for which solving a linear system a easier 
than solving one with 4., or finding a matrix Af that approximates so that 
only multiplication by M is needed. The majority of available preconditioners 
fall in the first category. 

2.5.1 Cost trade-off 

Since using a preconditioner in an iterative method incurs some extra cost, both 
initially for the setup, and per iteration for applying it, there is a trade-off be- 
tween the cost of constructing and applying the preconditioner, and the gain 
in convergence rate. Certain preconditioners little or no construction phase at 
all (for instance the SSOR preconditioner), but for others, such as incomplete 
factorizations, there can be substantial work involved. 

On parallel machines there is a further trade-off between the efficacy of a 
preconditioner in the classical sense, and its parallel efficiency. Most of the tra- 
ditional preconditioners have a large sequential component. 

2.5.2 Left and Right Preconditioning 

The transformation of linear system A — >• M-iA is often not what is used in 
practice. For instance, the matrix M-iA is not symmetric, so, even if A and 
M are, the standard conjugate gradient method is immediately not applicable to 
this system. In this case, a way of deriving the preconditioned conjugate gradient 
method would be to split the preconditioner as Af = M 1 M 2 and to transform the 
system as 

Mr^AM2-\M2x) = Mr^b (2.17) 

If M is symmetric and Mi = it is obvious that now one has a method with 
a symmetric iteration matrix and hence, standard conjugate algorithm can be 
applied. 

There is one more approach to preconditioning, which is easier to derive. 
Again consider the system in Equation 2.17. Here Mi and M 2 are called the left- 
and right preconditioners, respectively, and one can simply apply an unprecondi- 
tioned iterative method to this system. Only two additional actions ro = Mi~^ro 
before the iterative process and a’„ = M 2 ~^Xn after are necessary. 
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2.5.3 Jacobi Preconditioning 

The simplest preconditioner consists of just the diagonal of the matrix: 

I 0 otherwise. 

This is known as point incomplete preconditioner. It is possible to use this pre- 
conditioner without using any extra storage beyond that of the matrix itself. 
However, division operations are usually quite costly, so in practice storage is 
allocated for the reciprocals of the matrix diagonal. 

Block Jacobi Methods 

Block versions of the Jacobi preconditioner are derived by partitioning of the 
variables. If the index set S = {1, ...,n} is partitioned as S' = IJ^S'i with the sets 
Si mutually disjoint, then 

_ r aij if i and j are in the same index subset 
\ 0 otherwise. 

The preconditioner is now a block-diagonal matrix. 

Often, natural choices for the partitioning suggest themselves: 

1. In problems with multiple physical variables per node, blocks can be formed 
by grouping the equations per node. 

2. In structured matrices, such as those from partial differential equations on 
regular grids, a partitioning can be caused on the physical domain. Exam- 
ples are a partitioning along lines in a 2D case, or planes in 3D case. 

3. On parallel computers it is natural to let the partitioning coincide with the 
division of variables over the processors. 

Jacobi preconditioners need very little storage, even in the block case, and they 
are easy to implement. Additionally, on parallel computers they do not present 
any particular problems. On the other hand, more sophisticated preconditioners 
usually yield a larger improvement. 
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2.5.4 SSOR Preconditioning 

The SSOR (Symmetric Successive Over-Relaxation) like the Jacobi precondi- 
tioner, can be derived from the coefficient matrix without any work. If the original 
symmetric matrix is decomposed as 

A = D + L + L^ 

in its diagonal, lower, and upper triangular part, the SSOR matrix is defined as 

M ^ {D + L)D-^{D + Lf, 

or, parameterized by oj 

The optimal value of the w parameter, like SOR method, will reduce the number 
of iterations to a lower order. Specifically, for second order elliptic problems a 
spectral condition number K{Mu,opt~^A) = 0 {^/k(A)) is attainable, Axelsson et 
al [1984]. In practice, however, the spectral information is needed to calculate 
optimal u). 

The SSOR matrix is given in factored form, so this preconditioner shares many 
properties of other factorization-based methods. For instance, its suitability for 
vector processors or parallel architectures depends strongly on the ordering of 
variables. On the other hand, since this factorization is given a priori, there is 
no possibility of breakdown as in incomplete factorization methods. 

2.5.5 Incomplete Factorization Preconditioners 

A broad class of preconditioners is based on incomplete factorizations of the co- 
efficient matrix. The factorization is called incomplete if during the factorization 
process certain fill-in elements, nonzero elements in the factorization in positions 
where the original matrix had a zero, have been ignored. Such a preconditioner 
is then given a factored form M = LU with L lower and U upper triangular. The 
efficacy of the preconditioner depends on how well approximates 

Unlike the Jacobi and SSOR preconditioners, incomplete factorization precon- 
ditioners need a non-triviai creation stage. Incomplete factorizations may break 
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down (attempted division by zero pivot) or result in indefinite matrices (negative 
pivots) even if the full factorization of the same matrix is guaranteed to exist and 
yield a positive definite matrix. 

An incomplete factorization is guaranteed to exist for many factorization 
strategies if the original matrix is an M-matrix (i.e., Mij < 0 for all i ^ j). This 
was originally proved by Meijerink and Van der Vost [1981]. In cases where the 
pivots are zero or negative, strategies have been proposed such as substituting 
an arbitrary positive number (Kershaw [1978]), or restarting the factorization on 
A + al for some positive value of a (Manteuffel [1980]). 

An important consideration for incomplete factorization preconditioners is the 
cost of the factorization process. Even if the incomplete factorization exists, the 
number of operations involved in creating it is at least as much as for solving 
a system with such a coefficient matrix, so the cost may equal that of one or 
more iterations of the iterative method. On parallel computers this problem is 
aggravated by the general poor parallel efficiency of the factorization. 

Point Incomplete Factorizations 

The most common type of incomplete factorization is based on taking on a set S 
of matrix positions, and keeping all positions outside this set equal to zero during 
factorization. The resulting factorization is incomplete in the sense that fill is 
suppressed. 

The set S is usually chosen to encompass all positions {ij) for which ai^ ^ 0. 
A position that is zero in A but not so in an exact factorization is called a fill 
position, and if it is outside S, the fill there is said to be discarded. Often S is 
chosen to coincide with the set of nonzero positions in A, discarding all fill. There 
are two major strategies for accepting or discarding fill-in, one structural and one 
numerical. The structural strategy is that of accepting fill-in only to a certain 
level. The numerical fill strategy is that of drop tolerances: fill is ignored if it is 
too small, for a suitable definition of small. Although this definition makes more 
sense mathematically, it is harder to implement in practice, since the amount of 
storage needed for the factorization is not easy to predict. 
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Block Incomplete Factorizations 

The starting point for an incomplete block factorization is a partitioning of the 
matrix. Then an incomplete factorization is performed using the matrix blocks 
as basic entities. The most important difference with the point methods arises 
in the inversion of the pivot blocks. Whereas inverting a scaler is easily done, 
in block case two problems arise. First, inverting the pivot block is likely to 
be a costly operation. Second, initially the diagonal blocks of the matrix are 
likely to be sparse and one would like to maintain this structure throughout the 
factorization. Hence the need for approximation of inverses arises. 

In block factorization a pivot block is generally forced to be sparse, typically 
of banded form, and that we need an approximation to its inverse that has similar 
structure. Further this approximation should be easily computable, so the option 
of calculating the full inverse is generally ruled out by taking the banded part of 
it. For example, the simplest approximation to A~^ is the diagonal matrix D of 
the reciprocals of the diagonal of A. 

Banded approximations to the inverse of banded matrices have a theoretical 
justification. In the context of partial differential equations the diagonal blocks 
of the coefficient matrix are usually diagonally dominant. For such matrices, 
the elements of the inverse have a size that is exponentially decreasing in their 
distance from the main diagonal. 

2.5.6 Polynomial Preconditioners 

Polynomial preconditioners can be considered as the second class of precondition- 
ers where the preconditioning matrix is a direct approximation of the inverse of 
the coefficient matrix. Suppose that the coefficient matrix A of the linear system 
is normalized to the form A = I — B, and that the spectral radius of B is less 
than one. Then using the Neumann series, one can write the inverse of A as 
A~^ = thus, an approximation may be derived by truncating this 

infinite series. 

Dubois et al. [1979] investigated the relationship between a basic method 
using a splitting A ~ M — N , and a polynomially preconditioned method with 
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The basic result id that for classical methods, k steps of the polynomially pre- 
conditioned method are exactly equivalent to kp steps of the original method. 
For accelerated methods, especially the Chebyshev method, the preconditioned 
iteration can improve the number of iterations by at most a factor of p. 

Although there is no gain in the number of times the coefficient matrix is 
applied, polynomial preconditioning does eliminate a large fraction of the inner 
products and update operations and there may be an overall increase in efficiency. 



Chapter 3 


Numerical Experiments with 
PCG 


In this chapter, numerical experiments with the Preconditioned Conjugate Gra- 
dient algorithm have been presented. A nonlinear unsteady heat conduction 
problem is first considered and a study of variation in CPU time with the size 
of the problem is done for three linear system solvers, namely Gaussian Elimi- 
nation, Gauss-Seidel and Preconditioned Conjugate Gradient. In the second test 
case, the suitability of Preconditioned Conjugate Gradient for solving advection- 
diffusion equation is assessed. To study the effect of preconditioning, eigenvalues 
and condition numbers of the original and preconditioned matrices have been 
studied for the advection-diffusion problem. 

3.1 Heat Conduction Problem 

Here the following dimensionless equation governing unsteady heat conduction in 
a two dimensional variable conductivity region (Figure 3.1) is solved. 


d 





dB 

dr 


Thermal conductivity in this region varies in the following manner: 


(3,1) 


k(s) = i-pe 


(3.2) 


and the values of P that has been considered are /? = 0 and 0.4. 



3.1 Heat Conduction Problem 


31 


« = 1 



Figure 3.1: Ph 3 'sical and Computational domains for the nonlinear conduction 
problem. 

Physical and computational domains along with initial and boundary condi- 
tions are shown in Figure 3.1. Governing Equation 3.1 is discretized using pure 
implicit scheme for the unsteady term and second order central difference for all 
other terms. For a point {i,j) in the computational domain this discretization 
gives following relation for temperatures between two consecutive time steps; 




{ ju(P+l) 


l{p+1) 

Sj-1 

lCp+i) 




+ (1 + i6sA<y>) = eg 


Where s = and h{= AX = Ay) is the discretization in both X and Y 
directions. 


3.1.1 Solution Procedure 

The solution algorithm has been used to obtain the time marching temperature 
field inside the computational domain. 

1. Set time step p^O and initialize the temperature field at the time step, 
with a given initial temperature distribution. 
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2. Use the following steps to calculate temperature at (p + time step, 
by solving the nonlinear system of equations represented by the dis- 
cretized governing Equation 3.4. 

(a) Initialize = 0 (p) 

(b) Evaluate k values in Equation 3.4 using — k and 

thus generate a linear system of equations in 

(c) Solve the system of linear equations for using a linear system 
solver 

(d) Repeat steps (b) and (c) until convergence in k and is achieved 


3. Set and p — p + 1. 

4. Repeat step-2 and step-3 until steady state is reached. 


3.1.2 Results and Discussion 

The results reported for the nonlinear conduction problem includes 

1. Comparison of CPU time for solving the problem till steady-state for three 
solvers namely Gaussian elimination, Gauss-Seidel iterations and precondi- 
tioned conjugate gradient method. 

2. Temperature contours obtained using the above linear system solvers. 


Results have been presented for two variations in thermal conductivity, he., 
/5 = 0 and 0.4. During nonlinearity iterations {1) the convergence criteria used is 



X 100 < 0.01 


for all grid points (i) in the computational domain. 

CPU time variation shown in Figure 3.2 clearly shows that PCG is much faster 
then Gaussian elimination and Gauss Seidel. For example, for /3 = 0.4 variation 
in thermal conductivity, problem solving with PCG is 10 — 20 times cheaper {in 
terms of CPU time) than Gauss-Seidel and almost 1900 times cheaper than Gaus- 
sian elimination. Moreover, it has been generally observed that PCG becomes a 
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more attractive choice as the number of grid points in the computational domain 
increases. 

Figure 3.3 shows temperature contours inside the 2D domain for a linear 
(/3 = 0) and nonlinear (/3 = 0.4) case. In both the cases, temperature fields 
obtained using three different solvers are identical. This verifies the uniqueness 
of inverses for the matrices arising in this problem with respect to different matrix 
inverters used. 
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Figure 3.2: Variation in GPU time with number of grid points in X and Y 
directions. Thermal conductivity varies with temperature as A: = 1 - ^50. 



Figure 3.3: Temperature contours for linear (/3 = 0) and non linear = 0.4) 
conduction using three different matrix inverters. Thermal conductivity varies 
with temperature as A: = 1 — PO. 
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3.2 Advection-DifFusion Problem 

Here the numerical solution of the energy equation (advection-diffusion form) for 
laminar flow between two infinite parallel plates has been considered as the test 
problem. Flow is assumed to be hydrodynamically fully developed. Temperature 
of both the plates is kept constant at T, = 0 for 2 : < 0, and at To = 1 for a; > 0. 
Incoming fluid also has a temperature T. Physical domain for the problem is 
shown in the Figure 3.4. 



Figure 3.4: Laminar, hydro-dynamically fully developed flow between two infinite 
parallel plates with constant wall temperature conditions. 

The velocity profile for this flow is parabolic and is given by the following 
equation. 

3 


Now with the assumptions : 

1. constant thermal conductivity of fluid; 

2. negligible viscous dissipation; 

3. negligible compressibility effects. 

the unsteady energy equation for this flow is els follows (Bejan [1984]). 


D/2 


(3.4) 
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dT dT (d'^T d^T\ 


(3.6) 


3.2.1 Nondimensionalization 

Introducing the following dimensionless quantities: 


9 = 


T-Ti 

To-Td 


X = 


X 

'W/2' 



_ t 
u 


and 


u 


u 

U 


where 

Tj : incoming fluid temperature 
To : wall temperature 
D : distance between two plates 
U : average axial velocity. 

The governing Equations 3.4 and 3.5 takes the following form. 


u 


* 




(3.6) 


dr ax Pe \ax^ BY^ j ' ' 

where 

Pe = ^^. 

a 

3.2.2 Initial and Boundary Conditions 

The physical domain, Figure 3.4 has geometric, hydrodynamic and thermal sym- 
metry about the mid plane (the plane that contains .t— axis and is parallel to 
plates). Therefore, the portion above this plane is considered as computational 
domain. 

Now for this computational domain the non-dimensional initial and boundary 
conditions are as follows. 
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Initial condition: 

at r = 0, for all A" and Y 

6 = 0 

Boundary conditions: 
for r > 0, 


at the inlet of the computational domain (A = 0 and 0 < 1" < 1) 


6 = 0 


at the outlet of the computational domain (A^ 1 and 0 <Y < 1) 

dd 


dX 


0 


at the mid plane (A'’ > 0 and K = 0) 


^ = 0 
dY 


at the upper wall (A > 0 and y = 1) 


6 = 1 


The computational domain along with dimensionless boundary conditions is shown 
below. 


Y = 1 


0 = 1 


I 


-X 



x = o 


» 1 


Figure 3.5: Computational domain for the advection-diffusion problem. 
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3.2.3 Discretization 

Governing nondimensional equation, Equation 3.7, is discretized in the following 
manner. 

Unsteady term : Pure implicit 

Advection term : QUICK (In view of minimizing error due to false diffusion.) 
Conduction terms : Second order central difference 


NW 


N 


ww< 


w 


P I e 


sw 


Figure 3.6; Computational molecule for discretization. 


For this discretization, the present time-step is p and the next time-step is taken 
as p -I- 1. A computational molecule for discretization is shown below in Figure 
3.6. The unsteady term at time pd- 1, for the grid point P, can be written as : 


\dT ) p Ar 

The advection term at point P and time p -f 1 can be written as follows. 


d{u*e) 

dX 


1 p+i 


{u*ey+^ - {u*ey+^ 


j p 


AX 


(3.9) 


Using QUICK, at any time value of advection flux from the face e of the control 
volume surrounding P can be written as. 


(u-9). = - lcURVN,(AXf + ^CURVT,(AY) 


where 
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for u* > 0 


= 2 [(“'Op + {«'9)p1 


CURVN = K0p:-2(«'^)p + K0 

* (AX)2 


w 


and 


CURVT — ^ ^)p ('^ ^)s 

' (Ay)2 


Similarly it can be written for the face w 


(u-al = - ^C[/RVN^(AXf + ^CURVTJAV)' 

where 




CURVK,, = 


{u*e)p-2{u*9)^ + {u*e) 


ww 


{/\Xf 


and 


CURVT^ = 


{U*d)]Y N 2{u 9)y^, {u 9)y^p 


(Ay)2 

Substituting the above values of {u*9)g and {u*9)^ in Equation 3,9 gives 


(AA) 


d{u*9) 

dX 


p+i 


[(«•»)'« - («•»)«'] 


J p 




__1 
"s 

[(“•«)«' - + K«)f ‘ - (“'«)K + 2(«‘od‘ - 

(3,10) 
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Discretization for conduction terms can be written as 

( ^ a- , C - 2^^+^ + \ ^ ^ ^ 

\dx^ dry p [ (ax) 2 (Ay)2 ) '' 

Substituting the discretized values of various terms from Equations 3.8, 3.10 and 
3.11 in the governing equation 3.7, the following algebraic equation is obtained. 


(3u^Pe,) - (IQu^Pe, + 24) + [r + 7n^Pe, + 48 (1 + 77=)] 

- (24 - 9M;,Pee) - (247?2 _ u;^,Pe^) - (24]?^ - n^PeJ 

- (n;,Pe,) Cw - (wsPe.) 0^sw = rB^p (3-12) 


Here Pec is cell Peclet number, defined as Pec = Pe(AX), R — ^ and 

24(AX)Pec 

r = . 

At 

3.2.4 Nusselt Number Calculation 


To address the basic question of heat transfer for this flow, let us consider a 
relationship between the heat flux and wall-fluid temperature difference. Since 
fluid temperature varies over the cross-section, AT = To — T„^ is conventionally 
selected as the the wall-fluid temperature difference. We are then interested in 
the heat transfer coefficient 


h = 


( § 1 . 
\dy 


wall 


To — T„ 


(3.13) 


Using the above heat transfer coefficient, Nusselt number can be defined as 


Nu = ^ (3.14) 

k 

where Dh is the hydraulic diameter (D/, = 2D in this case). The value of mean 
temperature, T^ in Equation 3.13, is given by 

T = 

-^m — 


I-D/2 (“P) <^y 


DU 


(3.15) 
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Introducing the previously used dimensionless quantities and using Equations 
3.13, 3.14 and 3.15, expressions for Nusselt number are given by the following 
two equations: 


Nu = (3.16) 

1 Um 

and 

= l{l-Y^)e]dy (3.17) 

3.2.5 Solution Procedure 

Here Equation 3.12 represents the discretized governing equation and the initial 
and boundary conditions are as given in Section 3.2.2. Starting with a given initial 
condition for temperature, a marching solution in time for the entire domain has 
been found. It is clear from Equation 3.12, that a linear system of equations is 
to be solved in each time-step to proceed in this way. Preconditioned Conjugate 
Gradient (PCG) algorithm is used to solve this linear system at each time step. In 
this manner the temperature field is obtained as a function of time until steady- 
state is reached. 

The Nusselt number is evaluated using Equations 3.16 and 3.17. Simpson’s 
1/3 rule has been used to calculate the bulk mean temperature given by equation 
3.17. 

3.2.6 Results and Discussion 

The following results have been presented for this analysis: 

1. Variation of Nusselt number with time at selected locations on the heated 
plate (Figure 3.7) 

2. Steady-state variation of Nusselt number along the heated plate (Figure 
3.8) 

Both the results have been presented for six values of the Peclet number, i.e., 0.1, 
1, 10, 100, 500 and 1000. Figure 3.7 shows the variation of Nusselt number with 
time at selected X-locations along the plate for six different values of Peclet 
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number. Here, at any A'"— location Nusselt number decreases exponentially with 
time and then becomes constant at a steady-state v^alue. This steadj^-state value 
decreases with the distance along the heated plate. This is expected since along 
the plate as the fluid gets heated, rate of plate heat transfer and therefore the 
Nusselt number decreases. 

In Figure 3.8 both the plots depict the variation in steady-state Nusselt num- 
ber with distance along the heated plates. The value of Nusselt number first 
decreases exponentially with distance and then becomes constant at a larger dis- 
tance. At this distance the flow is said to be thermally developed. The value 
of Nusselt number for thermally developed flows has been reported in literature. 
For the considered problem, analytic solutions in literature reports Nu = 7.54. 
It has been observed that, when the flow is thermally developed, numerically 
calculated Nusselt number closely matches this value. Moreover, at higher values 
of Peclet number it exactly matches the reported value. An explanation for this 
fact is that, for this case, the analytic solution of energy equation reported in the 
literature is valid only for large values of Peclet number (Pe >> 1). 

In Figure 3.8, it is to be noted that in each plot, the upper two X — Y curves 
are slightly shifted in vertically upward direction and for these two, the values 
labeled on the F— axis are slightly inconsistent The labels on F-axis are exactly 
consistent with the bottom-most curve. Labels on the A— axis are consistent 
with all the curves in all the plots. 

For solving the linear system of equations, three solvers namely. Precondi- 
tioned Conjugate Gradient (PCG), point- by-point Gauss-Seidel and Gaussian 
elimination have been used. Preconditioned Conjugate Gradient (PCG) has been 
applied in two different ways. In the first, incomplete lower-upper (ILU) decompo- 
sition is done in each time-step when the solver is used. In the second, incomplete 
lower-upper (ILU) decomposition is done only once when the solver is used for 
the very first time in the computer program. For this problem. Point by Point 
Gauss-Seidel emerges as the fastest solver in terms of CPU time. Performance of 
Preconditioned Conjugate Gradient (PCG), when implemented with incomplete 
LU decomposition in every time step is almost ten times slower than point-by- 
point Gauss-Seidel. When it is implemented only once, it is almost four times 
slower than point-by-point Gauss-Seidel. Performance of Gaussian elimination is 
awfully slow [i.e., almost more than hundred times slower than point-by-Point 
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Gauss-Seidel). Moreover, since it is a linear problem the effectiveness of PCG is 
not clearly seen. The CPU time comparision is expected to show in favourable 
light for nonlinear problems. 
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Figure 3.7: Variation of Nusselt number with time at selected A— locations, for 
different values of Peclet number. Here X is the distance along the heated wall. 
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3.3 Assessment of Eigenvalues and Condition 
Numbers 

3.3.1 Eigenvalues 

As discussed earlier in Section 2.5, the spread of eigenvalues of the coefficient ma- 
trix is a qualitative measure of the intrinsic difficulty in solving a linear system. 
Thus the eigenvalue spectrum provides a qualitative basis for comparing differ- 
ent coefficient matrices. Moreover, the effect of preconditioning can be seen by 
comparing the eigenvalue spectra of the preconditioned matrix with the original 
coefficient matrix. 

3.3.2 Condition Numbers 

Another important issue in solving linear system Ax ~ h \5 the sensitivity of 
the solution for small perturbations in b and condition number of the coefficient 
matrix A is a measure of this sensitivity. 

Consider the linear system Ax = b, where A is nonsingular and b is nonzero. 
There is a unique solution x to this system which is also nonzero. Now suppose a 
small vector 6b is added to b and consider a perturbed system Ax = b + 6b. This 
system also has a unique solution x, and let 6x denote the difference between x 
and X, so that x = x + Sx. 

Now it can be easily shown that 

(3.18) 

Above equation provides a bound for in terms of The factor 

||A||||A"^|| is called the condition number oi A and is denoted as k(A). 

a(A) = ||A||||A-^|| (3.19) 

If the coefficient matrix is symmetric and positive definite (that is, all it’s 
eigenvalues are positive) above definition of condition number becomes 

k(A) = ^ (3.20) 

But in general (for any nonsingular coefficient matrix A) some of the eigen- 
values will be complex and for that case, the following definition of condition 
number has been used. 
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ac(j 4) = max |Ai|/ min |Ai| (3.21) 

i i 

Thus if k(A) is not too large, then small values of ||^Zj||/ 1|6|| imply small values 
of ||5a;||/||xl|. That is, the system is not overly sensitive to perturbations in b. 
Thus if k.(A) is not too large, one can say that A is well conditioned. If, on the 
other hand, k,{A) is large, a small value of |156||/||&|| does not guarantee that 
||(5.T||/||rc|| will be small. Since, in this case, there definitely are choices of b and 
6b for which resulting ||(lx||/||a:|| is much larger than the resulting 1155j|/||6l|. In 
other words, the system is potentially very sensitive to perturbations in b. Thus 
if k{A) is large, A is said to be ill conditioned. 

The best possible condition number is 1. Of course, the condition number 
(definition given by Equation 3.19) depends on the definition of norm, and a 
given matrix have a large condition number with respect to one norm and small 
condition number with respect to another. As such, there is no cutoff between 
the ill-conditioned and well-conditioned matrices. Any such (fuzzy) boundary 
depends upon a number of factors, including the accuracy of data used, the 
precision of floating point numbers and the amount of error one is willing to 
tolerate in the computed solution. 

3.3.3 Computation of Eigenvalues using MATLAB 

Here, MATLAB function eig has been used for estimating eigenvalues. This 
function produces a vector e containing the eigenvalues of A when used in the 
following format. 

e = eig(A) 

For real matrices, eig(A) uses the EISPACK routines BALANC, BALBAK, 
ORTHES, and HQR2. BALANC and BALBAK balance the input matrix. OR- 
THES converts a real general matrix to Hessenberg by using orthogonal similarity 
transformations. Then, HQR2 finds the eigenvalues of a real upper Hessenberg 
matrix by the QR algorithm. Thus the whole process can be summarized in 
following three points. 


1. Balancing of the input matrix 
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2. Reduction of the balanced matrix to upper Hessenberg form 

3. Calculation of eigenvalues of upper Hessenberg matrix using QR algorithm 
now these points can be further described as follows. 

Balancing 

The sensitivity of eigenvalues to rounding errors during the execution of the algo- 
rithm can be reduced by the procedure of balancing. The errors in the eigensystem 
found by the numerical procedure are generally proportional to the the Euclidean 
norm of the matrix, that is, to the square root of the sum of the squares of the 
elements. The idea of balancing is to use similarity transformations by diagonal 
matrices to make rows and columns of the matrix have comparable norms, thus 
reducing the overall norm of the matrix while leaving the eigenvalues unchanged. 
A symmetric matrix is already balanced. 

The cost of balancing is flop counts, where n is the order of the input 
matrix and the time taken by the subroutine BALANC is never more than a 
few percent of the total time required to find the eigenvalues. It is therefore 
recommended that you always balance nonsymmetric matrices. It never hurts, 
and it can substantially improve the accuracy of the eigenvalues computed for a 
badly balanced matrix. 

Reduction to Hessenberg form 

To understand the need for reduction to Hessenberg form, first one need to learn 
something about the QR algorithm. The QR algorithm is an iterative process for 
finding eigenvalues. It is based on QR decomposition, which is a direct procedure 
related to Gram-Schmidt process. A single iteration of the QR algorithm is called 
a QR step or QR iteration. In a QR step for a general matrix, the cost of QR 
factorization is ^ and then the cost of matrix multiplication RiQi is 2n^. Thus 
the total cost per QR step for a general matrix is Reduction to upper 

Hessenberg form is equally expensive and requires but it has to be done 
once, because upper Hessenberg form is preserved by the QR algorithm. Now 
since QR iterations with the upper Hessenberg matrix are relatively inexpensive 
and each can be done costing only 0{n^) flops, QR algorithm is applied only after 
converting the input matrix to Hessenberg form. 
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Algorithm for Householder reduction to Hessenberg form is as follows (Watkins 

[ 2002 ]). 

Algorithm 3.1: Householder Reduction to Hessenberg form 
for /c = 1 to n — 2 

Vk = sign(a:)l|a;||2ei d-r 
Vk=Vk/\\Vk\\2 

end 

Here sign is a function such that sign(x) returns an array y of the same size 
as X, where each element of y is: 

1 if the corresponding element of x is greater than zero 
0 if the corresponding element of x equals zero 
-1 if the corresponding element of x is less than zero 
and ei = (1, 0, 0). 

QR algorithm 

As already described, QR algorithm is an iterative process for finding eigenvalues. 
It is based on QR decomposition, which is a direct procedure related to Gram- 
Schmidt process. During iterations Hessenberg form of the matrix is preserved 
and the diagonal elements finally converge to eigenvalues. QR algorithm can be 
summarized as follows (Watkins [2002]). 

Algorithm 3.2: QR algorithm 

If A° is the Hessenberg reduction of A, then 
Set A = {Q°fA°Q° 
for k — 1, 2... 

Pick a shift e.g., choose = A^",^ 

Q^R^^ _ ^k-\ __ ^kj Qj^ factorization of A^~^ - 

Ak — R'^Q'^ + /// Recombine factors in reverse order 

end 


^k+l:m,k:m) 

2(Ai;,n,fc+l:m^fc)‘^/; 
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3.3.4 Test cases for the Computation of Eigenvalues and 
Condition Numbers 

For the computation of eigenvalues and condition numbers, following two cases 
of the advection-diffusion problem have been considered. 

Unsteady Nonlinear Advection-Diflfusion 
Governing equation for this case is given as 


do d(u*o) 1 r d de\ d bb \ 

Bt^ BX ~ Pe [dX \BX j ^ BY \BY) 
u* is assumed to be a parabolic flow profile given by 


(3.22) 


and 


(3.23) 


k = l- {0.4)6 


(3.24) 


Steady Linear Advection-DifFusion 

Governing equation for the steady and linear is a special case of Equation 3.22. 


B{u*e) 1 f a'^9 B‘^9\ 

BX ~ Pe + gy 2 J 


(3.25) 


Discretization Formulation 

Governing Equations 3.22 and 3.25 has been discretized in the same way as in 
Section 3.2.3. Moreover, the initial and boundary conditions used are also exactly 
the same as given in Section 3.2.2. 

This gives the following discretized equation for the unsteady-nonlinear case. 

FiTprr. .;r --FTST^ 

Wo A 
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(SupPCj.) ^lOtipPec + 6 (^k\v + Akp — 


gp+i 


+ TupPec + A8kp (l + 


+ 


( QR f kj\f + 4/cp — ks 


u 




- 

Pe.l 


6 ( kj? + 4k p — k 




QupPe, 


e’t 


K( 


ks "f Akp — k 


(uyyPec) (u^Pec) 05 ^/ — rO^p 




uiPeJ 0^+^ 


(3.26) 


where Pec is cell Peclet number, defined as Pec = Pe(AA''), R = (AX) / {IS.Y) and 

_ 24(AX)Pec 
At 

For the steady-linear case it becomes. 


(3u^Pec) dww ~ (19u*pPec + 24} 0^ -j- [Tu^^Pbc + 48 (l -t- i?^)] 0p 

- (24 - 9ti),Pec) Be - {2AR^ - <Pec) 9^ - (2AR'^ - u^Pcc) Os 

- «Pec) 9nw - {u*s?ec) 9sw = 0 (3.27) 

Procedure for Calculation 

For the steady-linear case the coefficient matrix is generated using the governing 
discretized Equation 3.27. In the unsteady-nonlinear case the very first matrix 
arising just after 10 time steps is considered where the time step is taken as 0.005. 
In both the cases, length of the computational domain is taken 50 and the original 
coefficient matrix has the structure shown below in Figure 3.9. 

Then in both the cases preconditioned matrices are obtained from the left-right 
incomplete lower-upper (ILU) preconditioning of the original coefficient matrix. 
Preconditioned matrix is given by 

P = L-^AU-^ (3.28) 

Where L and U are the lower and upper triangular matrices obtained from LU 
decomposition of the original coefficient matrix A. LU decomposition is incom- 
plete in the sense that complete band of the original coefficient matrix is not used. 
Infact, all the possible bands (semi-bandwidths shown in Figure 3.9 by arrows) 
are have been tested for this type of incomplete LU decomposition. 



3.3 Assessment of Eigenvalues and Condition Numbers 


53 



Figure 3.9: Original coefficient matrix 


Finally, these matrices are supplied to MATLAB and it calculates all the 
eigenvalues as discussed in Section 3.3.3. 

3.3.5 Results and Discussion 

Following results have been presented for this analysis: 

1. Variation of condition number with Peclet number for three different grid 
sizes, in the steady-linear case (Figure 3.10). 

2. Variation of condition number with Peclet number for three different grid 
sizes, for the unsteady-nonlinear case (Figure 3.11). 

3. Eigenvalue spectra of the original matrices in steady-linear problem for 
different values of Peclet number (Figure 3.12). 

4. Eigenvalue spectra of the original matrices in unsteady-nonlinear problem 
for different values of Peclet number (Figure 3.13). 

5. Eigenvalue spectra along with condition numbers for the original and pre- 
conditioned matrices, for the steady-linear case (Figures 3.14-3.28). 

6. Eigenvalue spectra along with condition numbers for the original and pre- 
conditioned matrices, for the unsteady-nonlinear case (Figures 3.29-3.43). 

Eigenvalue spectra shown in Figures 3.12 and 3.13 are for a 101 x 11 grid. 
All the other eigenvalue spectra, listed above in 5 and 6, have been presented for 
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Figure 3!l0: Variation of condition number with Peclet number for three different 
grid sizes in steady-linear case 



Figure 3.11: Variation of condition number with Peclet number for three different 
grid sizes in unsteady-nonlinear case 
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Figure 3.12; Eigenvalue spectra of the original matrices in steady-linear problem 
for different values of Peciet number. 


Pe=!000 



Figure 3.13: Eigenvalue spectra of the original matrices in unsteady-nonlinear 
problem for different values of Peciet number. 




Figure 3.14: Eigenvalue spectrum and condition numbers for Pe = 0.10, using a 
101 X 11 grid in the steady-linear case. 
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Figure 3.18: Eigenvalue spectrum and condition numbers for Pe = 1000, using a 
101 X 11 grid in the steady-linear case. 
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Figure 3.19: Eigenvalue spectrum and condition numbers for Pe = 0.10, using a 
125 X 11 grid in the steady-linear case. 
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Figure 3.21: Eigenvalue spectrum and condition numbers for Pe 
125 X 11 grid in the steady-linear case. 


10, using a 















































3.3 Assessment of Eigenvalues and Condition Numbers 


68 























0 
















































101 X 11 grid in the unsteady-nonlinear case. 
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Figure 3.34: Eigenvalue spectrum and condition numbers for Pe = 0.10, using a 
125 X 11 grid in the unsteady-nonlinear case. 
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Figure 3.35: Eigenvalue spectrum and condition numbers for Pe = 1, using a 
125 X 11 grid in the unsteady-nonlinear case. 
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Figure 3.37: Eigenvalue spectrum and condition numbers for Pe = 100, using a 
125 X 11 grid in the unsteady-nonlinear case. 





















Figure 3.39: Eigenvalue spectrum and condition numbers for Pe = 0.10, using a 
151 X 15 grid in the unsteady- nonlinear case. 
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Figure 3.42: Eigenvalue spectrum and condition numbers for Pe = 100, using a 
151 X 15 grid in the unsteady-nonlinear case. 
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Figure 3.43: Eigenvalue spectrum and condition numbers for Pe = 1000, using a 
151 X 15 grid in the unsteady-nonlinear case. 
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Fzom the results presented in this study, the following observations can be 
recorded: 

1. As shown in Figure 3.10, condition-number for the original matrix in the 
steady-linear problem, initially decreases with an increase in Peclet number 
(for Peclet number 0.10 to 100). Later, it starts increasing with an in- 
crease in Peclet number (for Peclet number > 500). In between, for Peclet 
number 100 to 500, a transition from decreasing to an increasing condition- 
number trend takes place. Condition-number for the original matrix in the 
unsteady-nonlinear problem (Figure 3.11), decreases almost asymptotically 
with Peclet number and then, finally becomes constant with a magnitude 
one. 

Here, the mathematical nature of the partial differential equation being 
solved depends on Peclet number. As the Peclet number increases the 
problem changes from elliptic to hyperbolic. Since we are solving it as 
an elliptic problem for all values of Peclet number, obtaining the solution 
becomes difficult for higher values of Peclet numbers where the problem is 
actually hyperbolic. Rate of convergence in our solution procedure validates 
this point. 

Condition numbers represent the mathematical complexity in solving the 
problem. In Figure 3.10 and Figure 3.11, when the condition is decreasing 
with increasing Peclet number, mathematically the problem is changing 
from an elliptic to relatively simple hyperbolic problem. But of course, as 
observed from the convergence experience, practically solving the problem 
is becoming increasingly difficult with increasing Peclet number. 

2. Eigenvalue spectra shown in Figure 3.12 and Figure 3.12 clearly show an 
increase in the scatter of eigenvalues with increase in Peclet number. There- 
fore, the relatively poor convergence of iterative methods for higher values 
of Peclet numbers is explained. 

3. In both the problems considered, condition number increases with a de- 
crease in grid spacing for a given value of Peclet number (Figure 3.10 and 
Figure 3.11). This effect is relatively less prominent at greater values of 
Peclet number especially for the unsteady-nonlinear problem. It has been 
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fuithei obseived that, although the eigenvalue spectrum for these matrices 
remains appioximately same on finer grids, a few eigenvalues move closer 
and closer to zero. This effect on finer grids results in an increased condition- 
number and gives rise to poor convergence of iterative solvers. 

4. Effect of /Lf/-preconditioning is well seen for both the problems consid- 
ered, Figures 3.14 to 3.28 for the steady-linear problem and from Figures 
3.29 to 3.43 for the unsteadj'-nonlinear problem. As the width of the band 
considered during LU decomposition is increased, the eigenvalue spectrum 
improves (he., the spread of eigenvalues decreases) and condition number 
also decreases. This observation can be explaiired from the fact that as 
the number no diagonals considered during LU decomposition increases, 
error introduced during LU decomposition decreases and the decomposi- 
tion becomes more and more exact. Therefore, the preconditioned matrix 
L~^AU~'^ becomes more closer to the identity matrix I. 



Chapter 4 


Mathematical Formulation for 
Modeling Transport Phenomena 
in Czochralski Crystal Growth 
Processes 


Czochralski crystal growth is a physically complex and mathematically rich pro- 
cess. The quality of grown crystals depends on many macroscopic and microscopic 
physical effects (Figure 4.1). Formulating a mathematical model to incorporate 
all these effects is not an easy task and judicious choices need to be made in this 
regard. Mathematical model presented here allows the consideration of as many 
effects as the problem demands and is believed crucial for the modeling process. 
This model draws heavily from the earlier works of Zhang et al. [1995], Prasad 
et al. [1997] and Zou et al. [1997] and ongoing Ph.D. thesis of Banerjee [2004]. 

Figure 4.2 shows a schematic of the right-half of the considered axisymmetric 
Cz-system. The domain consist of various materials in different phases with 
significantly different thermophysical and transport properties (Table 4.4). The 
Cz-model presented here focuses on the growth of Nd:YAG (Neodymium doped 
Yttrium Aluminum Garnet) single crystals. Nd:YAG falls in the category of 
oxide crystals, and since, the melt phase for the growth of these crystals does not 
contain any volatile component, need for an encapsulant layer over the melt gets 
eliminated. 

An introduction to the basic growth process and various transport mechanisms 
that take place inside a Cz-system is given in Section 1.2 of Chapter 1. A list of 
various important process parameters is given in Table 4.3. 
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Figure 4.1: Transport mechanism and inhomogeneities associated with crystal 
growth processes. 

4.1 Equations governing flow, heat transfer and 
solute transport 

The assumptions that go into developing the mathematical model for flow, heat 
transfer and solute transport inside the considered axisymmetric system are as 
follows: 

1. The fluid flow is two-dimensional, laminar and incompressible. 

2. The fluids are Newtonian. 

3. Thermophysical properties are constant and uniform in various phases. 

4. Boussinesq approximation for buoyancy-driven convection is applicable. 

5. Viscous dissipation is negligible. 

6. Free surface deformation is negligible, i.e., free surface is assumed to be 


flat. 
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7. Change in surface tension of the liquid phase with temperature is negligible, 
i.e., Maragoni convection is absent. 

8. Liquid phase is treated as a dilute solution of the solute in the melt. 

9. Segregation coefficient is independent of the local growth rate of the crystal 
and other operating conditions. 

10. No diffusion takes place in the solid phase. 



Figure 4.2: Schematic of the right-half of the axisymmetric Cz-system considered, 
showing various zones, interfaces and the free surface. 

The scales that are used to nondimensionalize the governing equations are 
given in Table 4.1. Where refers to the dimensionless thermophysical proper- 
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Table 4.1; Scales for Nondimensionalization. 


Variable 

Nondimensionalization 

length 

(.T,y) = {x*,y*)/L 

time 

t = ri{L/Uc) 

velocity 

{u,v) = {u*,v*)/Uo 

temperature 

e={T- T,)l(Tu - T,) 

pressure 

p^(p* - pgx* - Pa)/{pUo') 

concentration 

c = c-ic-„, 

thermophysical properties 



4.2 Generalized conservative form of the trans- 
port equations 

The governing equations presented in Section 4.1 can be further put in the fol- 
lowing general mathematical form, Zhang et al. [1995]. 


(4.6) 


where (f> is the generalized variable, S is the volumetric source, and T is the 
diffusion coefficient. The index 7 is set to zero if Cartesian coordinates are used, 
and is unity for polar coordinates. Equation 4.6 can be used for the entire multi- 
phase, multicomponent domain with the provision to account for local properties 
and abrupt changes in transport properties across the zone boundaries, and their 
possible movements. The values for (/>, F and S for continuity, inomentuni, energy 
and concentration equations for an axisymmetric geometry are given in Table 4.2. 
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Table 4.2: 

Variables of Equation 4.6 in an 

axisymmetric geometr}^ 

Equations 

Variable 

Diffusivity F 

Source term S 

continuity 

1 

0 

0 

^-momentum 

u 


-dP/dz -f GrPiPiO 

r-momentum 

V 

fii 

-dPjdr - jliv/r'^ 4- ptw^/r 

j’lu-momentum rw 

fii 

— (2/Ii/r)5(riy)/5r 

0-energy 

CpiO 


0 

concentration 

C 

1/Sc 

0 


4.3 Initial and boundary conditions 

The governing equations given in above sections require both initial and boundary 
conditions for a complete solution. Detail of these conditions is summarized 
below. 

4.3.1 Initial Conditions 

Initial conditions for the set of governing equations depends on the solution strat- 
egy adopted to get the complete solution. In case, a steady-state solution is sought 
any initial field can be taken as the initial condition. However, for transient anal- 
ysis exact initial conditions need to be used. These aspects are discussed in 
Chapter 5 along with the solution strategy. 


4.3.2 Boundary conditions 

Boundary conditions at various boundaries and interfaces in a Cz-system are as 
follows. 
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Bottom and side wall of the crucible 

Here for the velocities, no-slip boundary condition is applicable. Which can be 
written as 

u — V = 0, w — RCc?’ (4.7) 

and for the energy equation we have specified temperature at these walls and 
that gives the following nondimensional boundary condition. 

9 = 1 (4.8) 

For the concentration equation free gradient boundary condition is applicable. 

(VC) • n = 0 (4.9) 

Top and side wall of the enclosure 

Again for the velocities, no-slip boundary condition is applicable. That is 

u = V = w = 0 (4-10) 

and for the energy equation we have specified temperature variation at these 
walls. 

0 = specified variation (4-11) 

For the concentration equation free gradient boundary condition is applicable. 

(VC)-n = 0 (4.12) 

Axis of symmetry 

For the portion of the axis that is inside melt following conditions are applicable. 

— = 0, u = 0, ■n; = 0 (4.13) 

dr 
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(4.14) 



Solid portion of the axis of symmetry need only the following. 


(4.15) 


^ = 0 ( 4 . 16 ) 

Melt-crystal interface 

Here for the velocities, no-slip boundary condition is applicable. Which can be 
written as 


u = n = 0, w = Re^r (4.17) 

and for the energy equation we have the fusion temperature T/ at these walls and 
that gives the following nondimensional boundary condition. 


^ = 0 (4.18) 

At the melt-crystal interface, solute conservation gives the following boundary 
condition. 


-^(VC) • n = ^(1 - A:o)Vpuu • n (4.19) 

where ko is the segregation coefficient and Vpun is the nondimensional pull rate. 
Above boundary condition has been derived by equating the segregation flux at 
the interface with the diffusion flux just below it in the melt. Detailed discussion 
of the same can be found in Section 5.8 of Chapter 5. 
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Free surface 

Here with the flat interface assumption, the kinematic and zero shear 
boundary conditions at the free surface gives the following. 


for the swirl velocity 


ri = 0, 



dw _ 
dz 

A flux balance at the free-surface gives 


dT 

'^dz 


dT 

^dz 


+ ear{T^-T^) 


which on nondimensionalization gives the following 




—k 


d9 

dz 


■b Bir,;g(0ig 9oo) 


and for the concentration equation again the following is applicable 




Top and side wall of the crystal 

Here we no slip for the all the velocity components 

u = Us, V = Q, w = Re^r 

and for temperature, the following flux balance 

dO 


, de 

'^dz 


— —k 


dz 


+ Bir,s(0s ~ ^Do) 


stress 

(4.20) 

(4.21) 

(4.22) 

(4.23) 

(4.24) 

(4.25) 


9 


(4.26) 
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4.4 Process parameters and material properties 

Transport phenomena inside the Cz-system, and therefore the crystal quality 
depends on the effect of a number of system parameters. A list of the various 
important parameters inside the Cz-system is given below in Table 4.2. Typical 
values of material properties for Nd:YAG single crystals is given in Table 4.3. 

Table 4.3: Various Process parameters in a Cz-systern. 


Pull rate 

Initial melt height 
Enclosure diameter 
Crucible temperature 
Ambient temperature 


Crystal diameter 
Initial crystal height 
Enclosure height 
Enclosure temperature variation 
Gas pressure 


Crystal rotation 


Crucible rotation 
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Table 4.4; Thermophysical properties of Nci:YAG single crystals where subscripts 
s, I, g and c denote solid-crystal, melt, gas and crucible respectively. 


Description 

Symbol 

Value 

Density (solid) 

Ps 

4300 Kg/m^ 

Density (liquid) 

Pi 

3600 Kg/m^ 

Density (gas) 

Pg 

0.1602 Kg/m^ 

Thermal conductivity (solid) 

k-s 

8 W/(mK) 

Thermal conductivity (liquid) 

k 

4 W/(mK) 

Thermal conductivity (gas) 

hg 

0.139 W/(mK) 

Melting point 

'I'f 

2243 K 

Melt expansivity 

P 

1.8x10-® K-^ 

Viscosity (liquid) 

Pi ■ 

4.68x10-^ Kg/(ms) 

Viscosity (gas) 

Pg 

6.93x10-® Kg/ (ms) 

Heat Capacity (solid) 

('S 

800 J/(KgK) 

Heat Capacity (liquid) 

Cl 

800 J/(KgK) 

Heat Capacity (gas) 

Cg 

1419 J/(KgK) 

Heat of fusion 

AHf 

4.55x10® J/Kg 

Emissivity (solid) 

G 

0.3 

Emissivity (liquid) 

Cl 

0.3 

Emissivity (gas) 

Cc 

0.5 



Chapter 5 


Numerical solution of the partial 
differential equations governing 
Czochralski crystal growth 


In this chapter the numerical methodology for solving the generalized conservative 
form of the governing equations, Equation 4.6, is presented (Banerjee [2004], 
Zhang et al. [1995]). Then, the overall solution approach is discussed along with 
the details of the calculations for crystal pull rate and solute segregation. 

5.1 Coordinate transformation 

A physical problem composed of a domain of irregular shape is difficult to solve 
in a regular orthogonal coordinate system such as Cartesian owing to the fact 
that the boundaries of such domains may not conform to the regular coordinate 
axes. Hence an accurate representation of the system geometry becomes difficult. 
It is in this context that the concept of boundary fitted (body-fitted) coordinate 
system has evolved. When one use such a coordinate transformation, an irregu- 
lar geometry in a physical space (such as Cartesian) is transformed to a regular 
geometry in a computational space wherein boundaries of the transformed ge- 
ometry conform to the coordinate axes and then calculations are performed in 
the computational space. This transformation, which depends on the shape of 
the physical domain of interest, does not necessarily guarantee orthogonality of 
the grid lines in the transformed plane. In fact in most cases, the transformation 
results in a generalized nonorthogonal curvilinear system in which the problem 
is to be solved. 
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Introduction of nonorthogonality is generally detrimental, as the transformed 
governing equations contain extra terms to account for non-orthogonality of the 
coordinate system. However, the fact that an accurate representation of the phys- 
ical domain is possible by this technique, largely outweighs the above mentioned 
limitations. 

The conservation equation 4.6 in a generalized coordinate system (^,?]) can 
be written as: 


r^J 


dt 




d 


where 


(5.1) 


and 


^ 



Jf] — 

Other coefficients are 

= r^h^hrj^/J, 


L 

hr, \dT]) ■ 
ar, = T^hrjh^^/J, 


= r’^Xhrj/J, Pr, = r’^Xh^/J, 


h = K = 


A = X^Xr, + J = Xf^yr, - y^Xr,. 

It is important to note some of the properties of coefficient of a and p. Both 
these coefficients have units of area, and while a’s are always positive; /3’s may 
be positive, negative or zero. The absolute magnitude of P^ is always less than 
and if the coordinates (^, r]) are orthogonal; then P^ = 0, and ot^ = hr,. These 
coefficients can be conceptually regarded as areas. In view of their properties, ct’s 
are referred to as the primary areas and /?’s as secondary areas. The subscripts 
p and r] in these coefficients denote their correspondence to a ^ = constant and 
Tj = constant lines respectively. With this interpretation, the flux across a finite 
volume face may also be thought of as composed of two components. 
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5.2 Grid generation 

As is evident, Cz grov'th systems are composed of zones with \-astly different 
thermophysical properties and internally moving interfaces and free surfaces. To 
perform accurate simulations for these purposes, the grid system in different 
zones need to be generated in such a way that the interfaces separating these 
zones are preserved and coincide with some grid lines, permitting grid points to 
move only along these interfaces. Here for grid generation, MAGG (Multizone 
Adaptive Grid Generation) scheme developed by Zhang et al. [1995] has been 
used. The scheme has been developed based on the variational method and 
minimizes an integral function which is a measure of grid characteristics, namely 
the smoothness, orthogonality, weighted cell area and inertia of the grids. The 
scheme allows grids to move adpatively as the solutions progress, domains change, 
or both. Moreover, the grids are concentrated in the regions of large variations 
in field variables using appropriate weighting functions. 

5.3 Finite volume discretization 



Figure 5.1: Curvilinear finite- Volume grid arrangement. 
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The grids used are a structured trapezoidal mesh, as shown in Figure 5.1. For 
a typical point P, the generalized conservation equation, Equation 5.1, can be 
discretized in the following manner. 


{r^Jp(t))p — {r^'^JpcpY 


-A^Ar? + {a^J^ - - (a^J^ - (5^J^)^Ari 


- ’ ' ' '/ t-'Z'' ri jw~' i 2 ^ 

+( q ', - Pr,<h)sA^ = ir^JS)pA^Ap 


Above equation represents an overall conservation of d in a finite volume in terms 
of its fluxes at the auxiliary nodes, and is referred to as the flux discretization 
equation. The source term S is the original source term plus the terms consisting 
of the grid velocity components, 


dp 

where the grid velocities are defined as 


(5.3) 


_ dy dx dx dy 

dp dt dp dt 

Although in the calculation grids velocity components have been neglected 
and 5 = 5 is used. 

Now to obtain the finite difference form of Equation 5.2 in terms of 0 at the 
centers of the finite volumes. Power Law scheme of Patankar [1980] is used. The 
fluxes using this scheme can be represented as follows. 




dx dy dy dx 
d^ dt d^ dt 


(5.4) 


{a(J^)e = Fe^P + [DeM\F^e\) + (1 -Fe,0 |]](0p - d£;) 
= Fu,0p + [D,nM\P^w\) + [| 0,F,i, l]](0w - 0p) 
= Fn(l>P + p„A(|Fe„i) T [j -F„,0 |]](0 p - M 
Ms = F,cl>p + [F,.4(|Pe,|) + [I 0, F, 1]]{05 - 4>p) 


where 




Fe = (pue)e. 



Pe 


{SY/Tp + S-^/Tp) 


(5.6) 
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and the function A{\Pee\) is defined by Patankar [1080] as 

A{\Pee\) == max [O, 1 - O.ljPel"] (5.7) 

The other coefhcients along faces w, n, s are similar to Equation 5.6. Substi- 
tuting these fluxes from Equation 5.5 into Equation 5.2 we obtain the following 
discretized governing equation in the rj coordinates. 

ap(j)p = aE(t>E + o.w^w + 0-N<pN + b ( 5 - 8 ) 

where 


aE = [D,A{\Pee\) + [| -Fe, 0 Ij] (<^p - 

(5.9) 

aw — [DyjA(\Pe^\) + [[ 0,F^ |]] — 4>p), 

(5.10) 

aM = [DnA(\Pen\) + 0 -Fn,0 I]] {(f>p - M, 

(5.11) 

as = [DsA{\Pes\) + [\ 0,F,1]] (0s-<^p), 

(5.12) 

o P^Wp 
“ At ’ 

(5.13) 

ap = aE T aw T cw + as -1- Op — SpJ A^Ar] 

(5.14) 


The remaining terms from the substitution are included in b. In order to identify 
the origin of these terms b, is expressed as: 

b = bs + bno ( 5 - 15 ) 

where 


6 , = a°p((}‘’p + SJA^Ar] 


(5.16) 
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and 


-(Fe - F^)<ppAr] - (F„ - F,) 0 pA^ 


(5.17) 


In the above discretized formulation the source term is linearized as = Sc + 
Sp(l>p. Terms corresponding to the grid velocity components (namely, the rate of 
change of grid position with time) have been neglected. 


5.4 Treatment of pressure- velocity coupling 

Staggered grids have commonly been used for transport problems and are con- 
sidered to be more appropriate for pressure- velocity coupling. However, there are 
many advantages of using a non-staggered grid system for the present types of 
problem. For example, the imposition of interfacial boundary conditions in a non- 
staggered grid is more appropriate. The only drawback of this approach is that 
it requires a higher order interpolation to calculate fluxes and handle pressure- 
velocity coupling. The solution algorithm used for fluid flow calculations in the 
general curvilinear coordinate system is basically similar to the SIMPLER al- 
gorithm Patankar [1980], which consist of solving a pressure pressure equation 
to obtain the pressure field and solving a pressure-correction equation to correct 
the predicted velocities. However the scheme is more complicated because the 
velocity directions change continually along the coordinate lines. 


5.5 Overall solution scheme for solving the com- 
plete set of equations 

The overall solution scheme for solving the complete set of governing equations 
for Czochralski crystal growth is as follows. The growth process is simulated by 
a series of quasi-static solutions for flow and temperature fields and the solute 
transport equation is solved in a time marching way for the whole growth period. 
This approach is based on the observation that the variation of flow and tem- 
perature fields is fairly slow with the growth process and therefore, a quasi-static 
solution shall catch the basic features of the process. On the other hand, the 
solute partitioning at the melt-crystal interface and its transport in the melt are 
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intrinsically dynamic processes which are strongly time dependent. Therefore, 
a transient solution need to be obtained to track the solute partitioning and its 
redistribution in the melt during growth. The quasi-static solution is obtained 
for the flow field at a selected crystal height and then, based on this flov/ field, the 
concentration equation is solved in a time marching way. By performing a series 
of such quasi-static calculations, a period of growth process can be simulated and 
influences of various related factors can be investigated. 

5.6 Numerical treatment of moving interfaces 



Figure 5.2: Movement of melt-crystal interface and free-surface. 


Along with the solution of the field equations (mass, momentum and energy), 
motion of the melt-crystal interface and the free-surface need to be determined. 
Solidification of the melt is modeled as a pure substance with a fixed fusion 
temperature T/. This implies that the solid and liquid phases are separated by a 
sharp interface, s{z,r,t) = H{r,t) ~ z = 0, where H is the dimensionless height 
of the melt-crystal interface, Figure 5.2. Energy balance at the interface (Stefan 
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condition) defines its position and motion and can be written as 


dt 


- Upuii (Sj ■ n) = 


Stei 

Pr, 




(5.1S) 


where Upuii is dimensionless crystal pull velocity in z-direction. 

Pull velocity calculation is based on the assumption that the crystal and 
melt are not separated at the tri-junction. Uiow is the dimensionless free-surface 
velocity in —e^ direction and U^oi is the dimensionless solidification velocity in 
— n direction, Figure 5.2. Now, melt-crystal attachment assumption gives the 
following at the tri-j unction. 


[Usoi(e, ■ n)]|^ = 4- (5-19) 


Overall mass conservation for solidification gives the following condition. 


UiowIt = [Usoiiez ■ n)]|j. (5.20) 

where Rr is the radius ration (crystal-to-crucible). Solving above three equations 
gives the following expression for the crystal pull velocity. 


Upuii = -(1 - PsRr^) 


f Ste; 

lf7 




dn 


DBA . 


n) 


(5.21) 


5.7 Solute segregation 

During the crystal growth process, at the melt-crystal interface, there is a ten- 
dency for some of the solutes to remain in the melt and for the others to prefer 
the solid. This phenomenon causes the concentration of the solutes to accumu- 
late or decrease and is termed as solute segregation. However, convection in the 
melt produces mixing and alters the characteristics of the diffusion layer adjacent 
to' the interface. The spatial structure and intensity of the flow determine the 
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axial and lateral (perpendicular to the growth direction) profiles of the solute 
concentration in the crystal. 

Solute concentration of the solid phase is different from the liquid phase due 
to segregation. The equilibrium segregation during the solidification of a binary 
system can be determined from the corresponding phase diagram. In many cases, 
for low solute concentration the solidus and liquidus curves on a phase diagram 
can be considered as straight lines in the vicinity of melting temperature. This 
implies that the ratio of the solubility of a solute in the solid CT to that in the 
melt Cl, remains constant over a concentration range. This ratio is referred as 
the equilibrium segregation coefficient: 


k 


(5.22) 


Segregation coefficient for the considered Nd:YAG system is 0.2, Lu et al. 

[ 2000 ]. 



Figure 5.3; Solute Segregation Through the Melt-Crystal Interface. 


5.8 Solution of the solute transport equation 

During the transient solution of the solute transport equation, initial dimension- 
less concentration of Nd in the melt is taken as initial condition and boundary 
conditions used are as given in Section 4.3 of Chapter 4. Further elaboration on 
the boundary condition at the melt-crystal interface is as follows. 

Actually here the argument is that if a melt of solute concentration C* so- 
lidifies then according to the definition of equilibrium segregation coefficient, a 
solid crystal of concentration koC* will be formed. Since ko is less than one, some 
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of the solute must come back in the melt through melt-crystal interface. This 
phenomenon is known as solute segregation and segregation flux at melt-crystal 
inteiface can be calculated using Pick’s law of diffusion. Thus in dimensional 
form we have the following equation. 


-£)(VC*) • n = C*(l - /:o)V;,„ • n (5.23) 

and this on non-dimensionalization gives the following. 

-^(VC) ■n=.C(l-fco)Vpu„-n (5.24) 

where ko is the segregation coefficient and Vpun is the nondimensional pull rate. 

Two major problems are encountered during the solution of the solute trans- 
port equation . The first is the implementation of the above segregation boundary 
condition at the melt-crystal interface. In a solidification process, solute is re- 
jected into the melt if the segregation coefficient is less than unity. This increases 
the solute concentration in the melt near the interface. To address this issue, 
the approach of introducing an internal source term to simulate the segregation 
phenomenon has been followed. The solute is rejected as a flux, (1 - k)VpuiiC 
and we have converted this flux to a volumetric source in the first control volume 
cells in the melt near the interface, such that for any of such cells: 

solute input rate due to segregation = solute generation rate within the 

cell due to the source term 

This gives source term strength within a cell as 


Sceii^C{l-ko)\UiX^^ (5.25) 

where A is the non-dimensional area, vol is non-dimensional volume of the cell 
and Vpuii is the vertical component of pull rate, Vpuii. The possible error that 
is introduced because of this approach is minimal in this model since the grid 
height near the interface is very small due to adaptive nature of the grid and its 
clustering. 
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Another issue in solving the mass transport equation is to control the total 
mass in the system. During the growth, the melt domain continuously drops 
as more and more melt is solidified into the crystal. This is treated by simply 
regenerating a new grid system in the changed domain. Therefore it cannot 
guarantee the total mass conservation after each new grid system is generated. 
In addition, concentration interpolation is required to start calculations on the 
new grid. Although a very little imbalance is introduced each time, a large 
effect in mass deficit may be produced over a long growth period because of the 
accumulating effect. To avoid this difficulty the approach of conserving solute 
mass by proportionally correcting the solute concentrations in each grid has been 
followed. 


5.9 Code Validation and Grid Independence 

The computer code has been extensively validated, for example against the nu- 
merical results of Kobayashi [1978] and Prasad et al. [1997]. These code vali- 
dation and grid independence studies have been reported in the progress report 
of BRNS project entitled, “Mathematical Modeling of the CVD and Czochralski 
Crystal Growth Systems, Role of Magnetic Fields, and their Effects on Thermo- 
mechanical Properties”. Grid independent solutions were obtained on a 180x60 
mesh. 


5.10 Results and Discussion 

Using the solution scheme discussed in Sections 5.5 and 5.8 following numerical 
results have been obtained for the growth of Nd:YAG single crystals. 

1. Quasi-static temperature and flow fields in the melt at three melt heights 
and crystal rotations (Figure 5.7-Figure 5.9). 

2. Variation in pull- velocity of a continuously growing crystal for different melt 
heights and crystal rotations (Figure 5.10). 

3. Variation in concentration field in the melt with time for different crystal 
rotations (Figure 5.11-Figure 5.16). 
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4. Radial variation in solute concentration at different times for different crys- 
tal rotations (Figure 5.17-Figure 5.18). 

Moieo\'ei, the following values of various important dimensionless parameters 
have been used. 

Crucible radius = 1; 

Enclosure radius = 1; 

Crystal radius = 0.5; 

Initial crystal-height = 0.1; 

Total height of the domain in axial direction = 3.0; 

Initial melt-height = 1; 

Crucible rotation = 0; 

Crucible temperature = 1; 

Ambient temperature = —5.06. 

The enclosure wall temperature drops linearly from crucible temperature to am- 
bient temperature towards top. 

The results have been obtained using Preconditioned Conjugate Gradient 
(PCG) as the linear system solver. Results obtained were found to be exactly 
identical to the Gauss-Seidel as well as the Line-by-line TDMA iterative solvers. 

Figures 5.7 to 5.9 show the effect of crystal rotation on buoyancy driven con- 
vection in the melt. When there is no crystal rotation (case (a) in Figures 5.7 to 
5.9), flow is completely driven by buoyancy and counter clockwise convective rolls 
are initiated along the crucible wall. Superposition of crystal rotation with buoy- 
ancy driven natural convection (case (b) and (c) in Figures 5.7 to 5.9) brings a 
two cell pattern in the flow field. As the crystal rotation rate increases, the clock- 
wise convective rolls initiated due to crystal rotation push the natural convective 
currents towards the wall. Pumping action of crystal rotation makes the hot fluid 
near bottom move up towards crystal and thus isotherms in the temperature field 
shift upward towards the crystal. 

In Figure 5.10, variation in pull velocity of a continuously growing Nd:YAG 
crystal for two different initial melt heights is shown. Both the plots show a 
continuous decrease in pull velocity with time until it becomes zero. As the 
crystal grows, melt height goes down. In these simulations, this lowering in melt 
height is very-very small as compared to the increase in crystal height. Thus 
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dining flow and temperature flelds in the melt changes almost negligibly while 
radiation and convection losses from crystal surface continuously increase. At 
the melt cr 3 ^stal interface, melt side heat flux remains almost same while solid 
side flux increases continuou.sl 3 ' with increasing crystal height. Tiiis makes the 
difference between the two fluxes decrease continuously till it becomes zero. Since 
rate of solidification is proportional to this flux imbalance, pull velocitj' decreases 
continuously and finally becomes zero. 

Variation in concentration field in the melt with time for different crystal 
rotations has been shown in Figures 5.11 to 5.16. Concentration contours shown 
are for segregation coefficients, ka - Q (Figures 5.11 to 5.13) and ko - 0.2 (Figures 
5.14 to 5.16). No change in concentration of the melt takes place for ko = 1. 
Due to segregation concentration of the solute increases near the melt-cr 3 ^stal 
interface and this solute is then redistributed in the melt by convection currents. 
Concentration contours show how solute redistribution is caused by complex flow 
patterns in the melt. Rate of melt enrichment is maximum for ko = 0 as all the 
solute in the solidifying melt is rejected. Rejection for ko = 0.2^ corresponds to 
the Nd;YAG case. 

Figure 5.17 and 5.18 shows radial distribution of solute just below the melt- 
crystal interface. It can be seen that for no crystal rotation, radial variation 
in concentration is maximum with maximum concentration at the center of the 
crystal. This is simply a consequence of the purely buoyancy driven flow which 
makes the solute move towards crj^stal-center. Now as the crj'Stal rotation is 
increased concentration profile starts becoming fiat due to the centrifugal action 
at the melt-crystal interface. 


value of ko = 0.2 has been found in the experiments of Ln [2000] 
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Figure 5.5: Temperature (isotherms on the left) and flow field (streamlines on 
the right) inside the Cz-system for Re = 250, Gr = 1 x 10^ and melt height = 1. 






Figure 5.7: Temperature (isotherms on the left) and flow field (streamlines on 
the right) in Nd:YAG-melt for three different crystal rotations (z.e., (a): Re = 0; 
(b): Re = 250 and (c): Re = 500). For all three crystal rotations, melt height 
and Grashof number are 1 and 1 x 10'* respectively. 
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Figure 5.8: Temperature (isotherms on the left) and flow field (stream _ ^ 

the right) in Nd;YAG-melt for three different crystal rotations (i.e., (a): Re ~ U; 
(b)- Re = 250 and (c): Re = 500). For all three crystal rotations, melt height 

and Grashof number are 0.75 and 1 x 10'‘ respectively. 





Figure 5.9: Temperature (isotherms on the left) and flow field (streamlines on 
the right) in Nd:YAG-melt for three different crystal rotations (i.e., (a): Re = 0; 
(b): Re = 250 and (c): Re = 500). For alt three crystal rotations, melt height 
and Grashof number are 0.5 and 1 x 10‘^ respectively. 
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Figure 5.10; Variation in pull velocity of a continuously growing Nd;YAG crystal 
for two different initial melt heights, ?.e., (a): 1 and (b): 0.5. In both the cases, 
initial crystal height is 0.1 and Gr = 1 x 10^. Re is Reynolds number for the 
crystal rotation. 



Figure 5.11; Iso-concentration lines for Nd concent:.'., 
ent values of time, ie., (a) 0.01, (b) 0.05, (c) 0.1, (o_ 
variable is taken as {C — Co) x 10^, where Co(= 1) 
Other parameters are Gr = 1 x lO**, Re = 0, segi',-, 
pull rate = 3 mm/h. 


• n the melt at six differ- 
''■) i and (f) 5. Contour 
■'ll .solute concentration, 
'coefficient ko — 0 and 
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Figure 5.12: Iso-concentration lines for Nd concentration in the mdl “t six differ- 
J values of time, i.e., (a) 0.01, (b) 0.05, (c) 0.1, (d) 0.5, (e 1 and (f) 5. Contom 
variable Is taken as (C - C.) x 10‘. where C„(= 1) is initial soWe 
Other parameters are Gr = 1 x lOMte = 250. segregation coefficient k, - 0 and 

pull rate = 3 mm/h. 
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Figure 5.13: Iso-concentration lines for Nd concentration in the melt at six differ- 
ent values of time, i.e., (a) 0.01, (b) 0.05, (c) 0.1, (d) 0.5, (e) 1 and (f) 5. Contour 
variable is taken as (C - Co) x 10®, where Co(= 1) is initial solute concentration. 
Other parameters are Gr = 1 x 10'^, Re = 500, segregation coefficient = 0 and 
rate = 3 mm 
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Figure 5.14; Iso-concentration lines for Nd concentration in the melt at six differ- 
° 1 rj.' ■ n ni ('hi n fcl 0 1 fdl 0 5 , (e) 1 and (f) 5. Contour 

ent values of time, z.e., (a) 0.01, fbj U.Uo, icj u.i, w 

1 where Cn(= 1) is initial solute concentration. 

variable IS taken as (G - Ooj X iu , wneie c/QV j.; ; __ n o 

Other parameters are Gr - 1 x 10^ Re = 0, segregation coefficient ko - 0.2 and 
pull rate = 3 mm/h. 
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Figure 5.15; Iso-concentration lines for Nd concentration in the melt at six differ- 
mt values ot time, i.e., (a) 0.01, (b) 0.05, (c) 0.1, (d) 0.5. (e 1 and (f) .m Cmtcm 
variable is taken as (C - Co) x 10®, where Co(= 1) is mitial solute concentr ^ 
Other parameters are Gr = 1 x 10*, Re = 250, segregation coefficient ho = 0.2 

and pull rate = 3 mm/h. 






Figure 5.16: Iso-concentration lines for Nd concentration in the melt at six differ- 
ent values of time, i.e., (a) 0.01, (b) 0.05, (c) 0.1, (d) 0.5, (e) 1 and (f) 5. Contour 
variable is taken as (C- Co) x lO’k where Cq(= 1) is initial solute concentration. 
Other parameters are Gr = 1 x 10^, Re = 500, segregation coefficient /cq = 0.2 
and nnll rate = 3 mm/h. 
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Figure 5.17: Radial concentration profiles at different times for three different 
values of crystal rotations. Other parameters are, initial melt height = 1, Gr = 
l x 10^, segregation coefficient ko — 0 and pull rate = 3 mm/h. 
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Figure 5.18: Radial concentration profiles at diflPerent times for three different 
values of crystal rotations. Other parameters are, initial melt height = 1, Gr = 
1 X 10^, segregation coefficient ko = 0.2 and pull rate = 3 mm/h. 






Chapter 6 


Conclusions and Scope for Future 
Work 

6.1 Conclusion 

The Preconditioned Conjugate Gradient (PCG) algorithm has been tested for 
various benchmark problems in heat transfer. Variation in growth rate, melt 
convection and solutal transport have been studied for the growth of Nd:YAG 
single crystals by the Czochralski process. Following major conclusions have 
been arrived at in the present work. 

1. CPU times for the test case of unsteady-nonlinear heat conduction show 
that PCG is extremely effective for matrix inversion as compared to Gaus- 
sian elimination and the Gauss-Seidel technique. Moreover, PCG becomes 
an even more attractive choice on finer grids. 

2. PCG is well-suited for matrix inversion in advection-diffusion problems. 
The results obtained for Nusselt number in channel flow match the analyt- 
ical solution. 

3. Eigenvalue spectrums of the original and preconditioned matrices show that 
preconditioning with incomplete lower-upper (ILU) decomposition is very 
effective. Moreover, it is general in the sense that it can be constructed and 
applied for any nonsingular coefficient matrix. 

4. In the present simulation for crystal growth, crystal pull rate is varied to 
grow constant diameter crystals. The results obtained for puli- velocity vari- 
ation show that it is not possible to maintain favorable growth conditions 
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for a long period of time. To grow longer constant diameter crystals, pa- 
rameters such as crucible temperature and enclosure temperature need to 
be varied. 

5. Redistribution of the segregated solute in the melt takes place because of 
the complex flow patterns in the melt. Moreover, it is not possible to grow 
crystals with uniform radial solute distribution without any crystal rotation. 

6.2 Scope for Future Work 

The present work has opened up new areas of exploration. Following challenges 
are recommended for future researchers. 

1. An extensive assessment of various preconditioning techniques for PCG is 
required. 

2. Attractive parallel properties of the effective PCG algorithm calls for the 
development of appropriate algorithms. 

3. An extensive study of the efltect of various parameters on crystal growth is 
needed to provide guidelines for growing high quality single crystals. 

4. Exact prediction of solute incorporation into the grown crystals needs a 
microscopic model for solute segregation. 



References 


131 


[10] Golub, G. H. and O’Leary, D. P., Some history of the conjugate gradient 
and Lanczos methods, SIAM Review, Vol. 31, pp. 50, 1989. 

[11] Golub, G. H. and Ortega, J. M., Scientific Computing: An Introduction 
with Parallel Computing, Academic Press, San Diego, 1993. 

[12] Hestenes, M. R. and Stiefel, E. L., Methods of conjugate gradients for 
solving linear systems. Journal of Research of National Bureau of Standards, 
Vol. 49, pp. 409, 1952. 

[13] Kershaw, D. S., The incomplete Cholesky conjugate gradient method for 
iterative solution of sj'’stem of linear equations. Journal of Computational 
Physics, Vol. 26, pp. 43, 1978. 

[14] Kobayashi, N., Computational simulation of the melt flow during Czochral- 
ski growth. Journal of Crystal Growth, Vol. 43, pp. 357-363, 1978. 

[15] Kopetsch, H., Numerical simulation of the Czochralski bulk flow of Silicon 
on a domain confined by a moving crystal-melt interface and a curved melt- 
gas meniscus , Physicochemical Hydrodynamics, Vol. 11, No. 3, pp. 357-375, 
1989. 

[16] Lanczos, C., Solution of system of linear equations by minimized iterations, 
Journal of Research of National Bureau of Standards, Vol. 49, pp. 33, 1952. 

[17] Leonard B. R, A Stable and Accurate Convective Modeling Procedure 
based on Quadratic Upstream Interpolation, Comput. Methods Appl. Mech. 
Eng., Vol. 19, pp. 59-98, 1979. 

[18] Lu, J., Prabhu, M., Song, J., Li, C., Xu, J., Ueda, K., Kaminskii, A. A., 
Yagi, H. and Yanagitani, T., Optical properties and highly efficient laser 
oscillation of NdYAG ceramics. Applied Physics, Vol. B 71, pp. 469-473, 
2000 . 

[19] Meijerink, J. A. and Vorst, H. A. van der. Ah iterative solution method 
for linear systems of which the coefficient matrix is a symmetric M-matrix, 
Mathematics of Computations, Vol. 31, pp. 148, 1977. 



References 


132 


[20] Meijerink, J. A. and Vorst, H. A. \'an der, Guidelines for the Usage of 
Incomplete Decompositions in Solving Sets of Linear Equations as They 
Occur in Practical Problems, Journal of Computational Physics^ Vol. 44, 
pp. 134-155, 1981. 

[21] Minkowycz, W. J., Sparrow, B. M., Scheider, G. E., Pletcher, R. H., Hand- 
book of Numerical Heat Transfer, J, Wiley k Sons Inc., U.S.A., 1988. 

[22] Patankar, S. V., Numerical Heat Transfer and Fluid Flow, Hemisphere, 
Washington, DC, 1980. 

[23] Pissanetzky, S., Sparse Matrix Technology, Academic Press, New York, 
1984. 

[24] Prasad, V., Zhang, H. and Anselmo, A.P., Transport phenomena in 
Czochralski crystal growth processes. Advances in Heat Transfer, Vol. 30, 
pp. 313-435, 1997. 

[25] Reid, J. K., On the method of conjugate gradients for the solution of large 
sparse systems of linear equations, Large Sparse Sets of Linear Equations, 
edited by J. K. Reid, pp. 231, Academic Press, New York, 1971. 

[26] Saad, Y., Iterative Methods for Sparse Linear Systems, PWS Publishing 
Co., Boston, 1996. 

[27] Sheorey, T., Modeling of Enhanced Oil Recovery from a Porous Forma- 
tion, Doctoral dissertation, Indian Institute of Technology Kanpur, Kanpur, 
2001 . 

[28] Sheorey, T., Muralidhar, K. and Mukherjee, P. P., Numerical Experiments 
in the Simulation of Enhanced Oil Recovery, International Journal of Ther- 
mal Sciences, Vol. 40(11), pp. 981-997, 2001. 

[29] Sheorey, T., and Muralidhar, K., Application of Domain Decomposition 
to the Simulation of Oil Recovery on a Parallel Computer, Institution of 
Engineers Journal, Chemical Engineering Division, Vol. 82, pp. 24-29, 2001. 



References 


133 


[30] Sheorey, T., Muralidhar, K. and Mukherjee, P. P., Isothermal and Non- 
isothermal Oil-water Flow and Viscous Fingering in a Porous Medium, In- 
ternational Journal of Thermal Sciences, Vol. 42(7), pp. 665-676, 2003. 

[31] Tannehill, J. C., Anderson, D. A. and Pletcher, R. H., Computational Fluid 
Mechanics and Heat Transfer, Second Edition, Taylor k. Francis, Washing- 
ton, 1997, 

[32] Thomsan, J. F., Warsi, Z. U. A. and Martin, C. W., Numerical Grid 
generation-Foundation and applications, North Holland publishers, 1985. 

[33] Turkel, E., Preconditioning Techniques in Computational Fluid Dynamics, 
Annual Review of Fluid Mechanics, Vol. 31, pp. 385-416, 1999. 

[34] Van de Velde, E. F., Concurrent Scientific Computing, Springer- Verlag, 
New York, 1994. 

[35] Watkins, D. S., Fundamentals of Matrix Computations, John Wiley Sc sons, 
Inc., New York, 2002. 

[36] Zhang, H. and Prasad, V., A multizone adaptive process model for low 
and high pressure crystal growth. Journal of Crystal Growth, Vol. 155, pp. 
47-65, 1995. 

[37] Zhang, H, Prasad. V., and Moallemi, M.K., Numerical Algorithm using 
Multizone Adaptive Grid Generation for multiphase transport processes 
with moving and free boundaries. Numerical Heat Transfer, Part B, Vol. 
29, pp. 399-421, 1996. 

[38] Zou, Y. F., Wang, G, X., Zhang, H., Prasad. V. and Bliss, D.F., Macro- 
segregation, dynamics of interface and stresses in high pressure LEG grown 
crystals, Journal of Crystal Growth, Vol. 180, pp. 524-533, 1997. 



